Commit graph

688 commits

Author SHA1 Message Date
Siddharth Balyan
73cd8622f9
feat(billing): /billing terminal billing — interactive TUI + CLI client (#45449)
* feat(billing): nous_billing http client + BillingState core (phase 2b)

Phase 2b terminal-billing client foundation:
- hermes_cli/nous_billing.py: typed client for the 4 /api/billing/* endpoints
  (state/charge/poll/auto-top-up). Raises typed errors (BillingScopeRequired,
  BillingRateLimited, BillingAuthError) mapped from the live-verified contract;
  fail-open is the caller's job. Idempotency-Key enforced client-side.
- agent/billing_view.py: surface-agnostic BillingState core + Decimal money
  parsing (server emits decimal strings, not 2dp), fail-open builder,
  idempotency-key gen, custom-amount validation.
- 51 unit tests (decimal parse/format, payload tiering, error->exception
  matrix, fail-open, amount validation).

Plan: docs/plans/2026-06-13-001-phase-2b-terminal-billing-tui-plan.md

* feat(billing): billing:manage scope + lazy step-up re-auth (phase 2b)

- NOUS_BILLING_MANAGE_SCOPE constant.
- nous_token_has_billing_scope(): split-based scope check (no false-positive
  substring match).
- step_up_nous_billing_scope(): re-runs the device flow requesting
  billing:manage, reusing the held credential's portal/inference URLs + client_id
  (so a preview stays a preview), persists like _login_nous but WITHOUT the model
  picker. Returns True iff the minted token carries the scope (False when NAS
  silently downscopes a non-admin / unticked grant).

Lazy step-up (plan D-A): normal login path unchanged; 403 insufficient_scope
from a billing call triggers this. 7 unit tests.

* feat(billing): billing JSON-RPC methods for the TUI (phase 2b)

billing.state / charge / charge_status / auto_reload / step_up in
tui_gateway/server.py. Return STRUCTURED success envelopes (result.ok +
result.error=<code>) rather than JSON-RPC-level errors, so the Ink rpc() promise
always resolves and the TUI branches on the typed billing error code
(insufficient_scope, rate_limited, no_payment_method, …) to render the right
affordance. Money serialized as decimal STRINGS + display strings. charge mints
+ echoes an idempotency_key for retry reuse. 16 unit tests.

* feat(billing): /billing CLI handler + command registry (phase 2b)

- CommandDef("billing", subcommands=buy|auto-reload|limit), added to
  _SLACK_VIA_HERMES_ONLY so it routes via /hermes on Slack (keeps the 50-cap
  parity test green, same as /credits).
- cli.py::_show_billing + screen helpers: all 5 screens (overview, buy→confirm→
  poll, auto-reload, monthly-limit read-only). Reuses _prompt_text_input_modal /
  _prompt_text_input (D-C). Non-interactive (_app is None) renders text + portal
  deep-link, never prompts (R7). Decimal money end-to-end. 2s/5-min cancellable
  poll loop; 429/503 = retry not failure; settled = ledger truth. Lazy step-up on
  403 insufficient_scope. no_payment_method treated as mainline funnel-to-portal.
- 6 CLI tests; 156 command tests (incl. Slack/Telegram parity) green.

* feat(billing): /billing Ink TUI screens + tests (phase 2b)

- ui-tui/src/app/slash/commands/billing.ts: /billing TUI command covering all 5
  screens — overview (text), buy <amt> → ConfirmReq → charge → non-blocking 2s/
  5-min poll loop → settled/failed/timeout branches, auto-reload <below> <to> →
  ConfirmReq → PATCH, limit (read-only). Reuses the existing ConfirmReq overlay
  (D-C) — no bespoke component. Typed-error envelope branching: insufficient_scope
  arms the lazy step-up confirm; no_payment_method/rate_limited/cap funnel to
  portal. Client-side amount validation mirrors the server (bounds + 2dp).
- gatewayTypes.ts: Billing* response interfaces.
- registry.ts: register billingCommands.
- billingCommand.test.ts: 12 vitest cases (overview/gating/buy-confirm-poll-
  settled/no_payment_method/step-up/limit/auto-reload/validation).

TUI build green; 12/12 vitest pass; slash tests pass once @hermes/ink is built.

* docs(billing): scrub private cross-repo references

NAS is a private repo — remove all references to it from the public PR:
- drop the cross-repo planning doc (planning scaffolding, not a deliverable;
  the PR description documents the design)
- replace 'NAS' / 'PR #412 preview' mentions in code + test comments with
  generic 'the server' / 'a preview deployment'

* docs(billing): scrub final NAS reference in step-up docstring

* docs(billing): drop dangling plan-doc refs

The phase-2b plan doc was removed in the cross-repo scrub (300afcc0b)
but two module docstrings still pointed at it. Drop the dead refs.

* feat(billing): interactive /billing overlay + step-up UX, portal-URL & token fixes

Adds the interactive /billing TUI overlay and hardens the terminal-billing
client across CLI and TUI.

- TUI: full /billing overlay state machine (overview to buy to confirm,
  auto-reload, read-only monthly limit) reusing the existing confirm overlay.
- Step-up: surface the verification link in-transcript and open the browser
  via the TUI's own opener (the device flow runs in the headless gateway, so a
  printed URL was being dropped); run the step-up handler off the main loop and
  emit the link as an out-of-band event so the gateway stays responsive.
- Step-up copy is scope-accurate ("Billing permission granted") and re-checks
  /state so it never claims "enabled" when the org kill-switch is still off.
- Portal deep-links resolve to absolute URLs against the active portal base
  (the server emits them relative) - fixes a bare "/billing?topup=open" link.
- Billing calls refresh an expired access token via the stored refresh token
  instead of reporting a false "not logged in".
- Optimistic funnel: advise "set up a saved card on the portal" up front when
  no card is on file (advisory, not a hard gate).
- Token resolution is cached briefly so the 2s charge poll loop stops
  re-locking + re-reading the auth store on every tick; 401 re-resolves fresh.
- Remove the temporary demo-mode shims.

Validation: 87 Python billing tests, 88 TS tests (billing command + gateway
event handler), tsc clean, ink + ui-tui builds green.

* docs(billing): add /billing TUI screenshots for PR

* fix(cli): guard _last_invalidate on bare instances; update stale prompt-fallback test

The UI-invalidate throttle read self._last_invalidate unconditionally, which
raised AttributeError on HermesCLI instances built without __init__ (the
thread-safety test's object.__new__ shell). Guard the read with getattr.

The off-main-thread branch of _prompt_text_input was changed (#23185) to cancel
cleanly to None instead of falling back to a bare input() that would hang on the
slash-worker thread; the test still asserted the old direct-input fallback.
Update it to assert the current intended behavior: returns None, calls neither
run_in_terminal nor input(), and does not hang.
2026-06-19 01:53:32 +05:30
xxxigm
58ad6942d9
fix(tui): don't make Enter swallow trailing-space-only slash completions (#48425)
* fix(tui): don't make Enter swallow trailing-space-only slash completions

Submitting a slash command in the TUI took three Enter presses: one to
complete the name (/ex → /exit), a second that only appended the trailing
space the gateway adds to keep the classic-CLI prompt_toolkit dropdown open
(/exit → "/exit "), and a third to actually submit.

The composer's submit handler accepted the highlighted completion whenever
applying it changed the input at all, so the whitespace-only delta ate an
extra keypress. Treat a completion whose only change is trailing whitespace
on an already-complete token as "already complete" and fall through to
submit. Partial-name and argument completions (a real token change) still
accept on Enter as before.

The replace/accept logic is extracted into pure helpers (applyCompletion,
completionToApplyOnSubmit) in domain/slash.ts.

* test(tui): cover Enter/completion trailing-space behavior and isolate poller queue

- completionApply.test.ts asserts completionToApplyOnSubmit accepts real
  token completions (partial command name, argument) but returns null for a
  trailing-space-only delta on an already-complete command, so Enter submits
  instead of needing extra presses.
- test_notification_poller_delivers_completion / _skips_consumed previously
  shared the process-global process_registry.completion_queue. Their events
  carry no session_key, so a leaked/concurrent poller could dequeue and
  dispatch them to a fixture agent without run_conversation, flaking CI
  ("AttributeError: '_FakeAgent' object has no attribute 'run_conversation'").
  Isolate the queue per test (fresh queue.Queue via monkeypatch), matching the
  sibling poller tests that already do this.
2026-06-18 11:04:59 -05:00
Teknium
8abdab24c9
fix(tui): MCP headline counts connected servers, not disabled ones (#48402)
The TUI banner footer used the raw `info.mcp_servers.length`, so a
configured-but-disabled server (e.g. `linear`) was counted alongside
connected ones. With a disabled `linear` and a connected `nous-support`,
the TUI reported "2 MCP" while the classic CLI correctly reported "1 MCP"
(`mcp_connected = sum(1 for s in mcp_status if s["connected"])` in
hermes_cli/banner.py).

The collapse toggle even labels the count "connected", which was wrong
for the same reason.

Count connected servers for both the toggle and the footer segment, and
drop the `· N MCP` segment entirely when none are connected (matching the
classic banner, which only appends it when the count is > 0). The
expandable MCP section still lists every configured server, including
disabled ones.

Invariant test renders SessionPanel and asserts the headline equals the
connected count, never the configured total.
2026-06-18 05:41:19 -07:00
FT_IOxCS
92a456f711 fix(cli,deps): clear esbuild audit loop
Upgrade the Vite/esbuild surfaces that kept web, ui-tui, and the bootstrap installer on vulnerable esbuild versions, regenerate the root lockfile, and preserve intentional package+lock dependency edits during update lockfile cleanup.
2026-06-15 06:18:27 -07:00
Siddharth Balyan
7ba5df0d52
feat(billing): /credits command — balance + portal top-up handoff (#44776)
* feat(billing): /usage → portal top-up browser handoff

Add the terminal side of the billing slice (phase 2a): start a top-up by
throwing the user to the portal billing page with the top-up modal open. The
terminal does not confirm, poll, or track payment — checkout completes in the
browser and the next /usage shows the new balance.

- nous_account.py: parse organisation.slug/name from /api/oauth/account into
  NousPortalAccountInfo; add nous_portal_topup_url() building the org-pinned
  {base}/orgs/{slug}/billing?topup=open with a null-slug fallback to the legacy
  {base}/billing?topup=open (never /orgs/None/...).
- portal_cli.py: 'hermes portal topup' — fresh account fetch, identity line
  (Topping up as <email> / org <name>), browser open with printed-URL fallback,
  no-wait closing copy. No polling/confirmation (deferred to 2b).
- account_usage.py: the shared /usage credits block now links the org-pinned
  top-up URL (auto-opens the modal) + points to the command.

Depends on NAS #409 (organisation.slug/name + ?topup=open). Do not merge until
that is live on the target env; until then /api/oauth/account returns
organisation: { id } only and the URL falls back to legacy.

* feat(billing): /credits command for balance + top-up handoff

Replace the standalone `hermes portal topup` subcommand with an in-session
/credits slash command — a focused money surface (balance in, top-up out) that
works in the CLI, TUI, and every messaging platform from one registry entry.

- commands.py: register /credits (Info category). Slack is at its 50-slash cap,
  so /credits is routed via /hermes credits on Slack only (new
  _SLACK_VIA_HERMES_ONLY set) to avoid clamping a canonical command off the
  native list and breaking Telegram parity; native everywhere else.
- account_usage.py: build_credits_view() — one portal fetch → balance lines +
  identity line + org-pinned top-up URL + depleted flag, consumed by all
  surfaces. Reuses the same snapshot/URL builder as /usage so numbers match.
- cli.py: _show_credits() — balance block + identity line + 3-button panel
  (Open top-up / Copy link / Cancel) via the existing prompt_toolkit modal.
  ASK, never auto-launch; headless falls back to printing the URL.
- gateway/slash_commands.py: _handle_credits_command() — renders the block +
  tappable top-up URL + no-wait copy; works on button and plain-text platforms.
- /usage credits line now points to /credits.
- Retire `hermes portal topup` (portal_cli.py back to baseline); the engine
  (slug/name parse + nous_portal_topup_url) stays as the shared core.

No polling, no payment confirmation (billing phase 2a). Depends on NAS #409.

* fix(credits): /credits works in the TUI slash-worker (non-interactive)

In the TUI, /credits runs in the slash-worker subprocess where there is no
live prompt_toolkit app and stdin is the JSON-RPC pipe. _show_credits called
the 3-button modal unconditionally, which fell back to reading stdin →
exception → slash.exec rejected → the command produced no output (only the
pre-existing 'Credit access paused' banner showed).

- _show_credits: when self._app is None (TUI worker / piped / non-interactive),
  render the text variant — balance block + tappable top-up URL + no-wait line,
  same affordance as the messaging surfaces — and skip the modal entirely. The
  3-button panel still renders in the interactive CLI.
- Depleted banner copy: 'run /usage for balance' → 'run /credits to top up'
  now that /credits is the dedicated money surface (+ tests).
- Regression tests: _show_credits with self._app=None renders text and never
  invokes the modal; logged-out path.

* feat(tui): credits.view RPC for the /credits tappable top-up button

Add a credits.view JSON-RPC method returning the structured CreditsView
(logged_in, balance_lines, identity_line, topup_url, depleted) so the TUI can
render a clickable <Link> top-up button instead of plain text. Account-
independent (portal fetch gated on a logged-in Nous account), fail-open to
{logged_in: false} on any hiccup. Mirrors session.usage's credits-block pattern.

Frontend (TUI-local /credits command + Ink component) lands separately.

* feat(tui): /credits command with keyboard-driven top-up confirm

TUI-local /credits: fetches the structured balance via the credits.view RPC,
prints the balance + identity + top-up URL, then arms the EXISTING confirm
overlay (Enter = open top-up in browser via openExternalUrl, Esc = cancel).
Reuses ConfirmReq — no new overlay component/state/input handler. Headless
(openExternalUrl returns false) falls back to printing the URL.

- gatewayTypes.ts: CreditsViewResponse.
- commands/credits.ts: the command (mirrors /status's rpc+guarded pattern).
- registry.ts: register creditsCommands.
- test: balance+overlay armed, headless fallback, no-url, logged-out (4 cases).

Matches the CLI /credits 'Enter to open' affordance. Phase 2a: no polling.
2026-06-12 08:51:10 +00:00
teknium1
b3f5e17bb9 fix(tui): wrap long approval commands in the Ink overlay
Sibling site of the CLI approval-panel fix: the TUI ApprovalPrompt
rendered each command line with wrap="truncate-end", so a long
single-line command lost its tail at terminal width. Wrap to the
panel width via wrapAnsi before applying the 10-line preview cap.
2026-06-11 23:05:08 -07:00
brooklyn!
c6007e5c1a
Merge pull request #44534 from NousResearch/bb/approval-allow-permanent
fix(approval): carry allow_permanent to TUI + desktop approval prompts
2026-06-11 18:49:58 -05:00
Austin Pickett
e2145a5c9c
fix(ui-tui): stabilize embedded dashboard chat gateway (#44528)
Cherry-picked from #39840 by @flyinhigh and rebased cleanly on main.

- Defer config fetch in createGatewayEventHandler until gateway.ready to
  avoid render-phase RPC that can mutate transcript state and trigger
  React error 301 in embedded dashboard PTYs.
- Use undici WebSocket fallback when globalThis.WebSocket is unavailable
  (Node attach mode and sidecar mirror sockets).
- Add regression tests for both fixes.

Co-authored-by: flyinhigh <flyinhigh@users.noreply.github.com>
2026-06-11 19:47:53 -04:00
Brooklyn Nicholson
55a18e6860 chore(approval): tighten allow_permanent comments + DRY the no-always opt set
Collapse the verbose multi-line rationale comments across the TUI/desktop/
backend approval surfaces into single-line "why" notes, and derive
APPROVAL_OPTS_NO_ALWAYS from APPROVAL_OPTS instead of re-listing it.
No behavior change.
2026-06-11 18:42:59 -05:00
Brooklyn Nicholson
81436e143e fix(approval): carry allow_permanent to TUI + desktop approval prompts
When a tirith content-security warning is present the approval backend
forces allow_permanent=False and silently downgrades an "always" choice to
session scope (the persistence loop in check_all_command_guards only honors
"always" → permanent when no tirith finding exists). But the gateway notify
payload that drives the TUI and the Electron desktop app never carried that
flag, so both surfaces always rendered "Always allow" — offering a permanent
allow the backend would quietly refuse to persist.

Plumb allow_permanent end-to-end:
- tools/approval.py: include `allow_permanent: not has_tirith` in the gateway
  approval_data the notify callback emits as `approval.request`.
- ui-tui: thread `allowPermanent` through the event handler, gateway types,
  and ApprovalReq; ApprovalPrompt drops the "always" option (and renumbers the
  quick-pick keys) when it's false.
- apps/desktop: thread `allow_permanent` through the gateway payload type, the
  per-session approval store, and the inline ApprovalBar, which now hides the
  "Always allow…" dropdown item when permanent allow is disallowed — reusing
  the existing DropdownMenu / confirm-Dialog UI.

The desktop/TUI render path for approvals already landed in #38578 (the root
cause of approvals not surfacing in the GUI); this completes the salvage of
#37856 by carrying allow_permanent across both surfaces. #37856's original
thread-local _block() approach is dropped: desktop/TUI approvals resolve via
approval.respond → resolve_gateway_approval (the per-session queue), not the
_block()/request_id correlation, so a worker-thread callback waiting on _block
would never be released by the real UI.

Tests: gateway notify payload carries allow_permanent (True without tirith,
False with a tirith warning); ui-tui approvalAction reduced option set +
event-handler allowPermanent propagation; desktop store round-trip + the
ApprovalBar showing/hiding "Always allow".

Supersedes #37856
Closes #37812

Co-authored-by: LeonSGP43 <cine.dreamer.one@gmail.com>
2026-06-11 18:23:59 -05:00
ethernet
e080365a7a fix(tui): new weird typeerror 2026-06-11 15:36:39 -04:00
ethernet
5e5308d34d fix(node): fix @types/node version
TODO lock to a specific node/npm version.
this is a fix for a diff between 10 and 11.
2026-06-11 15:36:39 -04:00
helix4u
e71d746820 fix(mcp): avoid false failed startup status 2026-06-11 09:01:52 -07:00
Teknium
8972a151a4
feat(cli,tui): show time since last final agent response on the status bar (#44265)
Adds an idle clock to the context/status bar in both the prompt_toolkit CLI
and the Ink TUI: once a turn completes, a dim '✓ <elapsed>' segment shows how
long the session has been idle since the last final agent response. Hidden
while a turn is live (the per-prompt elapsed timer covers that) and before
the first turn completes.

- cli.py: track _last_turn_finished_at when the agent thread exits, surface
  it via _format_idle_since() in the snapshot, render in both the wide
  fragments path and the plain-text fallback.
- ui-tui: stamp lastTurnEndedAt when busy flips false after a live turn,
  thread it through appStatus -> StatusRule, render via a ticking IdleSince
  segment sharing the duration breakpoint/width budget.
2026-06-11 06:06:19 -07:00
ethernet
90f4b3040d change(tooling): remove react-compiler eslint, update concurrently
concurrently 9 had a critical vuln dependency,
react-compiler eslint plugin is built into react-hooks eslint plugin as
of https://react.dev/blog/2025/10/07/react-compiler-1
2026-06-10 11:59:34 -04:00
ethernet
3bfbb3f2a0 change(tooling): typecheck in CI, update ts to 6
fix(ui-tui): fix ts 6 real type errors

change(tooling): use new node everywhere
2026-06-10 11:59:34 -04:00
Robin Fernandes
af978ecb17 fix(model): require confirmation for expensive model selections
Rebased onto current main and re-ported across the restructured
surfaces: model flows now thread confirm_provider/base_url/api_key
through hermes_cli/model_setup_flows.py, the Discord picker lives in
plugins/platforms/discord/adapter.py, and the web dashboard picker
applies chat-mode switches via config.set so the expensive-model
confirmation can ride the response.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 00:24:06 -07:00
Ben Barclay
d33965396e
feat(tui): include session name in the terminal titlebar (#43188)
The terminal/console titlebar was composed from status marker + model +
cwd only; the session's (auto-)title never appeared, even though the TUI
already knows it.

Change the format to `<marker> <session name> · <model> · <cwd>`, with the
session name and cwd each omitted when absent so single-segment titles stay
clean. The current session's live title is pulled from the existing
session.active_list poll (which already carries each session's current flag
and title), so there's no extra round-trip; UiState gains a sessionTitle
field updated only when it actually changes, preserving the existing
idle-flicker guard.

Extract the join logic into a pure composeTabTitle() helper in domain/paths
and cover its edge cases (name omitted, cwd omitted, whitespace-only name,
marker-only fallback, truncation, boundary length) in paths.test.ts.
2026-06-10 11:24:01 +10:00
Austin Pickett
52f7e24a74 feat(tui): interactive Plugins Hub overlay for enable/disable
The TUI had no way to toggle plugins — `/plugins` only printed a static
list, and the classic `hermes plugins` picker is curses-based and can't
run inside the Ink UI. Users had to drop to a separate shell and run
`hermes plugins enable/disable`.

Add a PluginsHub overlay modeled on the existing SkillsHub:

- New gateway RPC `plugins.manage` (list + toggle) backed by the same
  disk-discovery + dashboard_set_agent_plugin_enabled primitives the CLI
  and dashboard already use, so all three surfaces agree on state. The
  toggle path also wires the plugin's toolset into platform_toolsets.
- `/plugins` with no arg opens the hub; any subcommand still falls
  through to the text slash worker for CLI parity.
- pluginsHub overlay state threaded through overlayStore / interfaces /
  useInputHandlers (Esc closes) / appOverlays (renders the FloatBox);
  preserved across turn teardown like other user-toggled overlays.
- Hub UI: arrow/number select, Enter/Space toggles live, Tab switches
  user-only vs all (bundled) scope, shows ✓/✗/○ activation glyphs.

plugins.manage added to _LONG_HANDLERS (disk + config I/O).
2026-06-09 10:50:13 -07:00
brooklyn!
1e5ff4a577
fix(hermes-ink): disable mouse tracking on raw-mode teardown to stop SGR leak (#42527)
The raw-mode teardown path (rawModeEnabledCount -> 0) disabled
modifyOtherKeys, kitty keyboard, focus reporting, and bracketed paste,
then dropped raw mode and detached the readable listener -- but left DEC
mouse tracking (1000/1002/1003/1006) asserted. With raw mode off and no
reader attached, the terminal falls back to cooked-mode echo, so every
mouse move emits a hover report (DEC 1003) that prints as literal text:
a flood of '35;col;row M' shards over the prompt in a long session.

handleSuspend() already guards against exactly this (it writes
DISABLE_MOUSE_TRACKING before SIGSTOP); the ordinary teardown path
missed the same guard. Add DISABLE_MOUSE_TRACKING to the teardown, and
re-assert tracking on raw-mode re-entry (via the Ink instance's
reassertTerminalModes, which is gated on altScreenActive and idempotent)
so a transient drop->re-add round-trips cleanly instead of silently
leaving the mouse dead.

Adds a regression test driving a real Ink mount: the last raw-mode
consumer detaching must emit DISABLE_MOUSE_TRACKING.

Reported via a community bug report.
2026-06-08 21:31:06 -05:00
Teknium
fd1e7c2bc3
fix(tui): install the process.on('exit') terminal-mode backstop (#42165)
#19194's fix added process.exit(0) to die()/dieWithCode() with a comment
relying on a process.on('exit') handler in entry.tsx that resets terminal
modes — but that handler was never installed. So /quit, Ctrl+C, Ctrl+D and
every process.exit() path left DEC mouse tracking (?1000/1002/1003/1006)
armed in the parent shell. The terminal then kept emitting mouse reports
into stdin — read as keystrokes by the shell or a freshly relaunched TUI —
surfacing as ...;...M garbage in the input box.

Install the missing handler. 'exit' fires once on real termination and runs
synchronous code only; resetTerminalModes() writes via writeSync, so the
disable sequence lands before the process is gone.

Fixes #28419
2026-06-08 08:21:19 -07:00
teknium1
00c46b8ff9 test(tui): cover heapdump opt-in gate + retention; add AUTHOR_MAP
On-disk vitest coverage for the auto-heapdump disk-safety guard: opt-in
gating (suppressed diagnostics-only path), truthy-spelling acceptance,
manual-trigger passthrough, and the retention prune. Test approach
adapted from #21780 (briandevans) and #21822 (LeonSGP43), reconciled to
the merged gate semantics. Maps alarcritty into AUTHOR_MAP for CI.
2026-06-08 02:20:49 -07:00
alarcritty
8ae0d054f4 fix(tui): guard automatic heap dumps against disk fill
Automatic heap dumps from the TUI memory monitor could write multi-GiB
  .heapsnapshot files on every threshold cross, growing ~/.hermes/heapdumps
  to tens of GiB. Add four layered safeguards:

  - Gate auto-high/auto-critical snapshots behind HERMES_AUTO_HEAPDUMP=1;
    manual dumps remain unchanged.
  - Always write the lightweight diagnostics JSON sidecar so users still
    get an actionable artifact when the snapshot is suppressed.
  - Cap total bytes in the dump dir (HERMES_HEAPDUMP_MAX_BYTES, default
    2 GiB), evicting oldest first, retaining the newest.
  - Add a cooldown between auto dumps (HERMES_AUTO_HEAPDUMP_COOLDOWN_MS,
    default 10 min) so an oscillating heap can't re-trigger.

  Closes #21767
2026-06-08 02:20:49 -07:00
Teknium
4ce9caed04
fix(tui): type execFileNoThrow stdio/ChildProcess and make memoryMonitor critical test heap-independent (#40612)
Salvaged from #40415; re-verified on main, tightened, tested.

Co-authored-by: psionic73 <psionic73@users.noreply.github.com>
2026-06-07 18:23:42 -07:00
Teknium
89929553b4
fix(tui): only patch liveSessionCount when it changes to stop idle re-render flicker (#40572)
Closes #40369.

Salvaged from #40502; cleaned up, re-verified against main, tests added.

Co-authored-by: r266-tech <r266-tech@users.noreply.github.com>
2026-06-06 18:42:19 -07:00
Siddharth Balyan
c79b6f23e6
fix(credits): let the "grant spent" notice yield on the next prompt (#40367)
credits.grant_spent is a one-time "your monthly grant is used up, you're now on
top-up" heads-up, but it was sticky — it camped the TUI status bar until the grant
refilled, so a user with healthy top-up saw "Grant spent · $990 top-up left"
indefinitely. Treat it like the usage-band notice: flash once, then clear on the
next prompt (startMessage). Depletion stays sticky (you actually can't make
requests). The Python `active` latch keeps the key, so it won't re-fire next turn.
2026-06-06 08:02:41 +00:00
Siddharth Balyan
fcb1944b4f
feat(credits): usage-aware credits — in-session notices, /usage view, dev readout (#40011)
Some checks are pending
Deploy Site / deploy-vercel (push) Waiting to run
Deploy Site / deploy-docs (push) Waiting to run
Docker Build and Publish / build-amd64 (push) Waiting to run
Docker Build and Publish / build-arm64 (push) Waiting to run
Docker Build and Publish / merge (push) Blocked by required conditions
Lint (ruff + ty) / ruff + ty diff (push) Waiting to run
Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run
Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run
Nix Lockfile Fix / auto-fix-main (push) Waiting to run
Nix Lockfile Fix / fix (push) Waiting to run
Nix / nix (macos-latest) (push) Waiting to run
Nix / nix (ubuntu-latest) (push) Waiting to run
OSV-Scanner / Scan lockfiles (push) Waiting to run
Tests / test (1) (push) Waiting to run
Tests / test (2) (push) Waiting to run
Tests / test (3) (push) Waiting to run
Tests / test (4) (push) Waiting to run
Tests / test (5) (push) Waiting to run
Tests / test (6) (push) Waiting to run
Tests / save-durations (push) Blocked by required conditions
Tests / e2e (push) Waiting to run
uv.lock check / uv lock --check (push) Waiting to run
* feat(tui): HERMES_DEV_CREDITS live-spend dev readout (L0 tracer for usage-aware credits)

L0 of the usage-aware-credits feature: a dev-only, env-gated tracer that
exercises the real header -> CreditsState -> TUI pipe end-to-end behind
HERMES_DEV_CREDITS, de-risking the L1/L5 build before the notice policy exists.

- agent/credits_tracker.py: CreditsState + parse_credits_headers (headers are
  strings -> paid_access via == "true", never bool(); retain-last-known; only
  subscription_micros may be negative; *_usd kept verbatim).
- run_agent.py: _capture_credits / get_credits_state / get_credits_spent_micros,
  session-start baseline latch, + dev-gated "credits" capture log.
- agent/chat_completion_helpers.py: capture on the streaming response.
- agent/agent_init.py: init _credits_state + _credits_session_start_micros.
- tui_gateway/server.py: _get_usage emits dev_credits_spent_micros only when flagged.
- ui-tui appChrome.tsx / types.ts: cents delta status segment + "(dev credits)" banner.

Off by default; silent for normal users. Validated live against staging
(capture log delta matches the TUI segment). Throwaway consumer (readout/log/
banner); credits_tracker + the capture plumbing are the real feature foundation.

* test(credits): lock parser under 9-state matrix + harden validation (L2)

Add tests/agent/test_credits_tracker.py with 92 tests covering the 9-state
matrix (healthy, sub_90pct, grant_exhausted, purchased_only, tool_pool_free,
depleted, debt, missing, no_org) plus validation edge cases: version strict==1
with warn-once latch for v>1, bool-string trap (paid_access/tool_pool_gated_off
== "true"/"false", never bool()), half-pair subscription limit treated as
both-absent while parse succeeds, USD regex ^-?\d+\.\d{2}$, non-int micros
→ None, negative non-subscription micros → None, as_of_ms junk → None, zero
limit ZeroDivision guard.

Harden agent/credits_tracker.py to match the spec:
- Add tool_pool_micros/tool_pool_gated_off/from_header fields to CreditsState
- Add depleted property (== not paid_access, never remaining==0)
- Change used_fraction guard to key off subscription_limit_micros (the actual
  denominator) not denominator_kind (metadata)
- Replace fail-soft _safe_int with a sentinel-returning variant; full validation
  now returns None on any malformed field rather than silently defaulting
- Add module-level warn-once latch for version > 1
- Add USD regex validation; add denominator_kind allow-list check
- Parse x-nous-tool-pool-* prefix headers (not x-nous-credits-tool-pool-*)

* feat(credits): notice spine — AgentNotice + notice_callback/notice_clear_callback + TUI binding (L1)

L1 of usage-aware credits: the driver-agnostic notice delivery spine that L4's
policy will fire through and L5's TUI render will consume.

- agent/credits_tracker.py: AgentNotice dataclass (text/level/kind/ttl_ms/key/id;
  kind defaults "sticky", kept TTL-expressive for a future config seam).
- run_agent.py: AIAgent gains notice_callback + notice_clear_callback slots and
  _emit_notice / _emit_notice_clear emitters (swallow all callback errors — a
  notice must never break the agent loop; no-op when unbound).
- agent/agent_init.py: thread both callbacks through init_agent.
- tui_gateway/server.py: bind both in _agent_cbs → notification.show / notification.clear
  WS events (snake_case payload, matching the existing gateway-event convention).
- ui-tui/src/gatewayTypes.ts: notification.show / notification.clear arms on GatewayEvent.
- tests/run_agent/test_notice_spine.py: 15 tests (emitter fire + fail-open + no-op,
  signature threading, TUI binding payload shape).

Messaging push is out of v1 (binds neither callback). CLI binding + the TUI render/
decode land with L4 (firing) and L5 (render) so turn-end flush is wired correctly.

* feat(credits): threshold reconciliation policy + tests (L4.1)

* feat(credits): wire threshold policy into capture + latch (L4.2)

After a fresh header parse, _capture_credits runs evaluate_credits_notices against
the agent's _credits_latch and emits the result — clears first, then shows (so a
recovered depletion clears before the "restored" success lands, and depleted wins
the latest-wins slot). Gated on a bound notice_callback: messaging (no callbacks)
still caches state for /usage but runs no policy. Parse stays fail-open (miss →
keep last-known); the eval/emit path warns on failure rather than swallowing, so a
depletion-notice bug can't vanish silently.

- run_agent.py: _capture_credits split into parse (swallow→miss) + policy (warn);
  latch lazy-guarded (object.__new__ safety).
- agent/agent_init.py: init agent._credits_latch = {"active": set(), "seen_below_90": False}.

* feat(tui): render credits notices in the status bar (L5, Strategy B)

The TUI now renders the notification.show / notification.clear gateway events the
agent emits — a level-colored notice overrides the status/verb slot when not busy.

- Notice state machine on turnController (pendingNotice + dedicated noticeTimer +
  show/clear/applyNotice/flushPendingNotice/clearNoticeState). createGatewayEventHandler
  decodes the events and delegates.
- Render priority busy > notice > status (appChrome StatusRule); notice text rendered
  verbatim (its glyph comes from the policy), shrinkable so it never clips model│ctx;
  dev-credits banner + Δ segment preserved. UiState.notice is snake_case (matches wire).
- Busy-wins: a notice arriving mid-turn is held and flushed at the THREE turn-end sites
  (recordMessageComplete / interruptTurn / recordError) — never idle(), which reset()
  also calls (would leak across sessions); reset() clears instead.
- Dedicated noticeTimer (never statusTimer); TTL starts on visibility with an id-guard;
  latest-wins cancels the prior timer; clear is key-matched (no-op on mismatch); a sticky
  survives a turn (flush no-ops with no pending); session reset clears (no cross-session leak).
- 20 tests (handler/turnController logic incl. R3-C2 timer isolation + render priority).

* feat(credits): cold-start seed for new Nous sessions (L3)

A genuinely-new Nous session has no inference header yet, so seed credits state from
the authoritative GET /api/oauth/account snapshot at session start (in the new-session
branch of _restore_or_build_system_prompt — inline, since the on_session_start plugin
hook gets no agent reference). The seed runs the shared notice policy, so a session that
opens already depleted warns IMMEDIATELY rather than only after the first turn.

- Maps the nested account fields (paid_service_access → paid_access; total_usable /
  subscription / purchased on paid_service_access_info; rollover on subscription), each
  None-guarded; float dollars → micros via round(d*1e6), *_usd left "" (render formats
  from micros — never synthesize a verbatim usd from a float).
- Magnitudes-only: no monthlyCredits on the endpoint → subscription_limit_* unset →
  used_fraction None → no warn90 from the seed (% only once a header lands, per D-E).
- Provider-guarded to Nous; fail-open (any error leaves _credits_state None, never
  blocks startup); paid_access unknown ⇒ True (never falsely depleted).
- run_agent.py: extracted the warm-path policy/emit block into a shared
  _emit_credits_notices() so capture and the seed fire notices identically.

* feat(credits): /usage Nous credits magnitudes view + recovery trigger (L6)

Add Nous credit dollar magnitudes to /usage (subscription / top-up / total
+ rollover + renewal + portal CTA), magnitudes-only per v1 (no % until the
account endpoint exposes a denominator). Reuses the existing account-usage
render machinery via a new pure build_nous_credits_snapshot() that maps a
NousPortalAccountInfo to an AccountUsageSnapshot; no nous branch is added to
fetch_account_usage (keeps the per-provider boundary intact).

CLI /usage also doubles as a depletion-recovery trigger: a force_fresh
account fetch, kept in a SEPARATE local so it never clobbers the
header-sourced agent._credits_state (which alone carries used_fraction). If
paid access recovered while credits.depleted is latched and a notice
consumer is bound, it reuses agent._emit_credits_notices() to clear it.
Gateway /usage displays magnitudes only — messaging binds no notice
consumer, so it performs no recovery emit.

Fail-open throughout: any portal hiccup leaves /usage unaffected.

* refactor(credits): dedupe HERMES_DEV_CREDITS flag parse via shared helpers

The dev-flag truthy check was inlined in three places. Replace with the shared
utils.is_truthy_value (run_agent.py, tui_gateway/server.py — also drops a
redundant inline `import os`) and a hoisted DEV_CREDITS_MODE export in
ui-tui/src/config/env.ts (consumed by appChrome, which also stops recomputing the
env check on every render). Behaviour-preserving; identical truthy set.

* fix(credits): cut dead /usage recovery trigger + bound portal fetches (L6 review)

Adversarial review found the /usage depletion-recovery trigger dead AND broken:
the CLI binds no notice_clear_callback, the TUI runs /usage in a separate
slash-worker subprocess (its own agent/latch), and the no-clobber rule made it
evaluate stale paid_access anyway. Recovery already happens on the next inference
(warm path), so the trigger was redundant — remove it and stop the depleted
notice over-promising.

- cli.py: remove the dead recovery block; bound the /usage portal fetch with a
  10s wall-clock timeout (ThreadPoolExecutor) like the per-provider fetch —
  urllib's per-socket timeout is not a wall-clock guarantee.
- agent/credits_tracker.py: reword the depleted CTA to "run /usage for balance"
  (no false recovery promise; /usage shows fresh magnitudes, sticky clears next turn).
- agent/conversation_loop.py: same wall-clock timeout on the cold-start seed fetch
  so a stalled portal can't hang session startup; tidy its time import.

* chore(credits): dev notice-state fixtures (HERMES_DEV_CREDITS_FIXTURE)

Throwaway dev scaffolding to exercise the notice pipeline without real spend or
Redis seeding. Set HERMES_DEV_CREDITS_FIXTURE to a state name (healthy / sub_90pct
/ grant_exhausted / depleted / clear) or a file path whose contents name a state
(re-read each turn → flip states live for recovery testing). _capture_credits
injects the chosen CreditsState instead of parsing real headers and runs the
shared notice policy. Deletable with the rest of the HERMES_DEV_CREDITS scaffolding.

* feat(credits): /usage monthly-grant % gauge

The portal /api/oauth/account subscription block now carries monthly_credits
(the per-period grant allowance, the % denominator). The consumer parsed
monthly_charge but dropped monthly_credits, so /usage stayed magnitudes-only.

Capture monthly_credits into NousPortalSubscriptionInfo + _subscription_from_payload.
build_nous_credits_snapshot emits a Subscription usage window (real % used, routed
through the existing render machinery) when monthly_credits is a finite positive
denominator and credits_remaining is finite and <= cap; otherwise it degrades to
magnitudes-only (older portals, rollover-over-cap, or non-finite payloads).

Guards (adversarial-review-driven): reject non-finite operands (json.loads parses
bare NaN/Infinity by default → would render $nan + a false 100% used), reject
bools, guard div-by-zero (cap>0), and suppress the gauge when remaining > cap
(rollover spanning the period makes the cap a nonsensical denominator → the
$X-of-$Y detail would read as a contradiction). Debt (remaining<0) clamps to 100%.

Money rule preserved: the ratio + magnitudes are computed from numeric float
account fields via display formatting, never by parsing a server *_usd string
(there are none on these dataclasses).

13 gauge tests added (tests/agent/test_nous_credits_gauge.py).

* fix(credits): show /usage Nous block whenever a Nous account is present

/usage runs in a slash-worker subprocess whose resolved inference provider is
often not "nous" even when the user has a Nous account, so gating the Nous
credits block on (provider == "nous") hid it entirely — the account data was
fully available but never rendered.

Gate instead on "a Nous account is logged in": a cheap local auth-state lookup
(get_provider_auth_state('nous') has an access_token) decides whether to attempt
the portal fetch, regardless of which provider inference runs on. In the gateway
the block is also lifted out of the 'if provider:' scope so a Nous-credentialled
user with another (or no) resident inference provider still sees their balance.
Fail-open and the per-fetch wall-clock timeout are preserved.

* fix(credits): show /usage Nous block when there's no live agent (TUI slash-worker)

In the TUI, /usage runs in a slash-worker subprocess that resumes the session
WITHOUT building an agent (self.agent is None), so _show_usage early-returned
"(._.) No active agent" before ever reaching the Nous credits block — which is
agent-independent (a portal fetch gated on Nous auth-state). Extract the block
into _print_nous_credits_block() and run it at the no-agent / no-calls
early-returns too (returns True if it printed, so the fallback message only
shows when there's genuinely nothing).

Verified live against staging: the block + monthly-grant gauge now render in the
slash-worker /usage path (previously hidden). The plain CLI REPL + messaging
paths are unchanged (they have a live agent).

* feat(credits): escalating 50/75/90 usage bands (single status line)

Replace the lone 90%-used warning with three escalating bands (50 info, 75 warn,
90 warn) shown as ONE status-bar line: it displays the highest band the
subscription grant has crossed, replaces the line as usage climbs, steps back
down on recovery, and clears below 50%. No stacking, no per-turn churn.

Bands live in a tunable CREDITS_USAGE_BANDS list; the policy derives everything
from it. Single notice key (credits.usage) with a usage_band latch field so the
notice only re-emits when the band actually changes. The crossing gate
(seen_below_90) is preserved so a fresh live session that opens mid-range stays
quiet until it has been observed below the lowest band (cold-start primes it when
it wants an open-high warning). Denominator math unchanged: % = subscription
grant burn (cap - grant_remaining)/cap, clamped [0,1]; top-up never moves the %.

Migrated test_credits_policy.py to the new key + added TestUsageBands (climb,
step-down, recovery-clear, idempotent, inclusive boundaries).

* feat(credits): hydrate notices at session OPEN via shared seed (TUI + first-turn)

Notices previously only fired inside a conversation turn (first message), so a
session that opened already depleted / past a usage band showed nothing at
'ready'. Extract the cold-start seed into a shared seed_credits_at_session_start()
and call it (a) in the TUI/desktop agent build right after the notice callback is
wired (fires at 'ready', before any message) and (b) as the first-turn fallback in
conversation_loop. Idempotent (skips once _credits_state exists) and fail-open.

The seed now maps monthly_credits -> subscription_limit_micros +
denominator_kind='subscription_cap', so used_fraction is computable at seed time
and usage-band warnings (not just depletion) hydrate on open. Primes the crossing
latch so a session opening already in a band warns immediately. Degrades to
depletion-only when monthly_credits is absent (older portals).

Adds test_credits_cold_start.py covering open-at-band, depletion, debt, no-cap
degradation, and the shared seed (fires/idempotent/skips-non-nous).

* feat(credits): /usage monthly-grant % gauge + fixture support + TUI surfacing

agent/account_usage.py: build_nous_credits_snapshot emits a subscription %% gauge
when the portal supplies a positive, finite monthly_credits denominator with
remaining <= cap (guards reject NaN/Infinity and rollover-over-cap, which would
render $nan or a contradictory $X-of-$Y); degrades to magnitudes-only otherwise.
Adds shared nous_credits_lines() (auth-gated, wall-clock-bounded portal fetch) so
the CLI and TUI /usage render the same block, and _snapshot_from_credits_state()
so HERMES_DEV_CREDITS_FIXTURE drives /usage offline too.

TUI: session.usage RPC carries credits_lines (agent-independent) and the /usage
panel renders them regardless of API-call count or resume state — previously the
TUI's separate /usage implementation only showed token counts.

Money rule preserved: %% and magnitudes come from numeric float account fields via
display formatting, never by parsing a server *_usd string.

* feat(credits): CLI REPL inline notices (parity with TUI)

The plain CLI agent bound no notice callbacks, so credit notices were TUI-only.
Bind notice_callback/notice_clear_callback on the CLI AIAgent; _on_notice renders
a single level-colored line above the prompt (error red / warn yellow / success
green / info dim) via _cprint, and seed credits at session open so a depletion or
usage-band warning shows before the first message — the same hydration the TUI
got. _on_notice_clear is a no-op (the REPL prints lines, no persistent slot).

* test(credits): add sub_50pct + sub_75pct dev fixtures for the new usage bands

The fixture set jumped 10%% -> 90%%; add sub_50pct (uf 0.5 -> band 50 info) and
sub_75pct (uf 0.75 -> band 75 warn) so the new escalating bands are exercisable
via HERMES_DEV_CREDITS_FIXTURE across all three surfaces (notice, session-open
seed, /usage gauge).

* fix(credits): usage-band notice clears on next prompt (not sticky-forever)

A 50/75/90 usage heads-up was sticky and camped the status bar indefinitely. Clear
the visible credits.usage notice when a new turn starts (startMessage), so it shows
until your next prompt then yields. The server latch is unchanged, so it won't
re-nag at the same band — it only re-shows when the band actually changes (climb)
or clears when usage drops below the lowest band. Depletion stays sticky.

* refactor(credits): consolidate the /usage credits block behind nous_credits_lines()

The CLI (_print_nous_credits_block) and the messaging gateway (_handle_usage_command)
each re-implemented the auth-gate + portal fetch + render, and both bypassed the
dev-fixture short-circuit that only the TUI honored — so /usage ignored
HERMES_DEV_CREDITS_FIXTURE on the CLI and in chat. Route both through the shared
agent.account_usage.nous_credits_lines() helper: one fetch/render path, one auth
gate, and the fixture works on every surface (~60 fewer duplicated lines).

The gateway usage test recorded only the last asyncio.to_thread call; /usage now
dispatches both the account fetch and the credits fetch, so it records every call
and matches the account fetch by its provider arg.

* fix(credits): keep the /usage gauge type-safe and log its fail-open path

_is_finite_num is now a TypeGuard[float], so the type checker narrows the gauge
operands (monthly_credits / credits_remaining) and the magnitudes passed to
_fmt_usd through it — no more None-operand warnings on the arithmetic. Add a debug
breadcrumb on the nous_credits_lines portal-fetch fail-open so a dead /usage block
is diagnosable in agent.log without a dev flag.

* fix(credits): harden the header tracker — prod-leak gate, hot-path probe, fire-and-forget seed

- Prod-leak guard: dev fixtures (HERMES_DEV_CREDITS_FIXTURE) now also require
  HERMES_DEV_CREDITS, so a stray fixture var can't surface fabricated balances on a
  real account. Matches the documented run workflow (both vars set together).
- Hot-path probe: parse_credits_headers checks for the version sentinel header
  before allocating a lowercased copy of the response headers — skips that work on
  every non-Nous API call. Behaviour-identical and still case-insensitive.
- Fire-and-forget seed: the real portal fetch in seed_credits_at_session_start now
  runs in a daemon thread, so a slow/unreachable portal never delays session "ready"
  (previously blocked up to 10s). The dev-fixture path stays synchronous; the thread
  re-checks idempotency before hydrating (a live header may land first).
- Diagnostics: debug breadcrumbs on the parse and seed fail-open paths so a crashed
  parser / dead seed is distinguishable from a legitimate no-headers miss.

Cold-start tests set HERMES_DEV_CREDITS alongside the fixture to match the gate.

* test(tui): fix env-timing in the StatusRule dev-credits assertion

DEV_CREDITS_MODE is read once at module load (config/env), so mutating
process.env.HERMES_DEV_CREDITS inside the test couldn't flip it — the dev-banner
assertion only passed if the env was exported before vitest started, and failed in a
normal run. Move that assertion to a sibling file that mocks config/env with
DEV_CREDITS_MODE: true (scoped, no module-reset / React-identity hazard).

* test(credits): cover the dev-fixture /usage render and usage-band clear-on-prompt

- _snapshot_from_credits_state (the offline /usage renderer) had no direct test:
  lock the gauge math, the verbatim *_usd magnitudes, the depletion line and the
  fixture marker, plus the no-cap (no gauge) and None-state cases.
- turnController.startMessage had no test for clearing the credits.usage notice on
  the next prompt while leaving credits.depleted sticky.

* feat(credits): deliver credit notices over messaging gateways

Bind notice_callback/notice_clear_callback on the per-turn gateway agent
so usage-band / depletion / restored notices reach Telegram/Discord/Slack/
etc. Previously the messaging gateway bound neither callback, so the agent's
_emit_credits_notices early-returned and a chat user crossing a band got
nothing unless they ran /usage manually.

- render_notice_line(): AgentNotice -> single plaintext line (level glyph +
  text), plaintext-only so it renders uniformly without per-platform escaping.
  Fail-soft on malformed/empty notices.
- Standalone push for every notice (messaging has no persistent status bar):
  route through the shared _deliver_platform_notice rail (honors private/
  public delivery + thread metadata), scheduled onto the gateway loop via
  safe_schedule_threadsafe from the agent's sync worker thread — same pattern
  as _status_callback_sync.
- The fired-once latch lives on the cached (reused-in-place) agent and
  persists across turns, so a band crosses once -> one push, no per-turn
  re-nag. Re-fires only after idle-eviction rebuilds the agent (a reminder).
- Recovery ('Credit access restored') rides the show path (emitted as a
  success notice, not a clear). notice_clear_callback is a no-op: a sent
  platform message can't be cleanly retracted.

Tests: render glyph/levels/fail-soft + public/private delivery seam through
_deliver_platform_notice + no-adapter no-op.

* fix(credits): don't double the glyph on messaging notices

render_notice_line prepended a per-level glyph, but the notice policy already
bakes the glyph into the text (and the TUI + CLI render it verbatim) — so every
credit notice over messaging came out doubled ("⚠ ⚠ Credits 90% used",
" ✕ Credit access paused"). Emit the text verbatim instead; drop the now-dead
level→glyph map.

The render tests fed glyph-less text (and the success case only checked
startswith), so the doubling slipped through. Rework them around the verbatim
contract and add an end-to-end regression that runs real evaluate_credits_notices
output through render_notice_line and asserts the line is returned unchanged.
2026-06-06 13:18:18 +05:30
brooklyn!
e375c33f70
fix(tui): clean force-send of queued messages (#40235)
Force-sending a queued message (double-empty-enter, or interrupt-mode
submit) flipped busy→false optimistically, so the queue drain raced the
still-unwinding turn: duplicate user bubble, a stray "queued: …" note, and
the cancelled turn's "Operation interrupted…" reply leaking in.

interruptTurn gains `keepBusy`: hold busy until the gateway's real settle
edge (message.complete, suppressed while interrupted), which drains the
queued message exactly once — desktop "send now" parity. The interrupt
paths now queue + interrupt instead of optimistically sending.
2026-06-06 01:39:10 +00:00
Ben
825629424d fix(tui): persist timed-out/cancelled clarify prompts in transcript
When a clarify prompt times out (backend _block returns an empty answer
after the configured timeout) or is dismissed with Esc/Ctrl+C, the live
ClarifyPrompt overlay was torn down by turnController.idle() ->
resetFlowOverlays() with no persistent transcript record. The question and
options vanished from the screen while the agent's follow-up still referred
to "the options above".

The answered path already persists the question + answer; only the
unanswered exits left no trace. This asymmetry is the bug.

Fix (TUI layer only, no Python/protocol change):
- formatAbandonedClarify() in lib/text.ts renders the question + the same
  1-based numbered option list shown by ClarifyPrompt, plus a reason
  ('timed out' / 'cancelled').
- Timeout: createGatewayEventHandler flushes a still-live clarify into the
  transcript as a plain system line when the clarify tool's own tool.complete
  fires. A live capture of the event stream confirmed this is the only point
  where the overlay is still set after a timeout: the sequence is
  clarify.request -> (timeout) -> tool.complete -> message.complete, with NO
  intervening message.start/tool.start. On a real answer, answerClarify()
  clears the overlay before tool.complete arrives, so the hook no-ops there
  (no double-write); a per-requestId guard set is belt-and-braces.
- Explicit cancel: answerClarify('') persists the prompt as a system line
  instead of a transient 'prompt cancelled' flash.

System lines always render (unlike trail lines, which /details can hide),
so the record reliably survives on screen as standard output.

Verified live in the TUI: an Esc-cancelled clarify now leaves the question +
options + '(cancelled - no selection)' in the transcript after the turn ends.

Tests: formatAbandonedClarify unit cases + gateway-handler behavioral cases
(persist on clarify tool.complete, no flush on a non-clarify tool.complete,
no double-persist on repeat tool.complete, no-op when the overlay was already
cleared by an answer).
2026-06-04 16:25:54 -07:00
rexdotsh
98903d0313 fix(tui): reuse live session on resume 2026-06-04 08:18:26 -07:00
Brooklyn Nicholson
725290db63 test(hermes-ink): fuzz the tokenizer flush valve against fragment leaks
Hammer createTokenizer with the worst stalls a terminal can produce —
split + flush at every interior byte, and a 200-report byte-by-byte feed
that flushes after every single byte — and assert the two invariants that
make the SGR-leak class structurally impossible: nothing ever leaks as a
text token, and every complete report reassembles whole. A mixed
mouse+keystroke variant proves real input survives the same storm.
2026-06-03 19:38:08 -05:00
Brooklyn Nicholson
6efc7eda57 refactor(hermes-ink): delete now-dead SGR mouse fragment recovery
With the tokenizer reassembling split CSI sequences across a flush (prior
commit), no SGR mouse fragment can reach a text token anymore — terminals
write a mouse report as one atomic sequence, and any read/flush split now
re-joins in the tokenizer buffer instead of leaking. That makes the whole
downstream recovery layer dead code:

- SGR_MOUSE_FRAGMENT_RE, MOUSE_BURST_NOISE_RE, MOUSE_BURST_RESIDUE_RE
- parseTextWithSgrMouseFragments / parseSgrMouseFragment /
  normalizeSgrMouseFragment
- the whole-text mouse-burst noise fast path in parseMultipleKeypresses

Remove all of it (~185 lines) and the tests that only exercised it. The
narrow legacy X10 wheel-tail resynth stays (distinct mechanism, kept with
its own test). This retires the #17701#18113#26781#28463#35512
regex hardening chain in favor of the one correct parser fix.
2026-06-03 19:29:42 -05:00
Brooklyn Nicholson
de124800a2 test(hermes-ink): drop input-event SGR guard test
The guard it covered was removed in the previous commit (fragments no
longer reach input-event — they reassemble at the tokenizer). Reassembly
is now covered by termio/tokenize.test.ts and the flush-boundary cases in
parse-keypress.test.ts.
2026-06-03 19:24:51 -05:00
Brooklyn Nicholson
f354323547 fix(hermes-ink): reassemble split mouse sequences at the tokenizer; drop the regex sink
Root-cause fix for the SGR mouse fragment leak (`46M35;40M...` typed into
the prompt). The leak was never really about the fragments — it was the
flush emitting them. When App's 50ms watchdog fires mid-CSI during a render
stall, the tokenizer was force-emitting the buffered partial as a token and
resetting to ground, so both the prefix and the ESC-less remainder surfaced
as unparseable input.

Make the flush state-aware (xterm.js discipline): a bare ESC still flushes
to the Escape key (the legitimate ESCDELAY case), but a buffer still inside
a multi-byte control sequence (csi/osc/dcs/apc/ss3/intermediate) is NOT
emitted — it's kept so the continuation reassembles on the next feed. A
one-tick truncation valve in createTokenizer.flush() drops a partial that
survives a second flush with no progress, so a genuinely truncated write
can't fuse into the next keypress.

With partials never entering the input stream, the downstream scrubber is
dead code: remove the SGR fragment guard from input-event.ts (both the
original `/^\[<\d+;\d+;\d+[Mm]/` and the consolidated form added earlier in
this PR). The parse-keypress burst-recovery regexes (MOUSE_BURST_*) are now
also redundant but left in place as a safety net for one release; they can
be removed in a follow-up once this soaks.

Tests: tokenize.test.ts proves a mid-CSI flush keeps/reassembles and that a
stale partial is dropped after a second flush and a bare ESC still emits;
parse-keypress.test.ts adds the end-to-end split-then-reassemble case
yielding a single clean mouse event with no leaked key.

Supersedes #29337.
2026-06-03 19:24:28 -05:00
Brooklyn Nicholson
01c010e233 fix(hermes-ink): collapse SGR mouse fragment guards into one flush-aware rule
When App's 50ms flush watchdog fires mid-CSI during a render stall, an
SGR mouse report (ESC[<btn;col;row M/m) is split across stdin chunks: the
tokenizer force-emits the buffered prefix and resets to ground, so both
the prefix and the ESC-less remainder reach InputEvent as nameless tokens.

The previous guard only matched a full `[<\d+;\d+;\d+[Mm]` fragment, so
the flushed prefixes (`ESC[<0;35;`) and the 1-/2-field and leading-`;`
tails (`46M`, `35;46M`, `;46M`) still leaked into the composer as
`46M35;40M...` during long sessions.

Replace the three would-be narrow regexes with one consolidated rule that
covers every split position. A `(?=...\d)` lookahead keeps typed `<`, `[`,
`;`, and `M` safe (no coordinate digit), and the embedded M/m terminator
in the param class leaves stuck-together fragments / prose intact. The
existing `!keypress.name` gate continues to protect real keystrokes, which
arrive one char per chunk with a name set.

Supersedes #29337 (covers the prefix-leak and leading-`;`/1-/2-field tail
cases that PR's two added guards missed).
2026-06-03 19:05:26 -05:00
teknium1
e76d8bf5aa
fix(tui): stop persisting full tool output in trail lines (silent OOM death)
A heavy --tui session (browser snapshots, large tool outputs) silently
OOM-killed the Node parent within minutes — closing the gateway child's
stdin, which the user saw only as a bare "gateway exited" / stdin EOF.
CLI was immune. Root cause: each completed tool's verbose trail line
embedded up to 16KB of result_text, persisted in transcript Msg.tools[]
for the whole session and rendered EXPANDED by default, so an Ink
render-node tree was built for every one of up to 800 messages at once.
That tree blew past Node's heap at a few hundred MB — far below the 2.5GB
memory-monitor exit threshold, so the death was never even attributed.

- text.ts: persisted verbose tool-trail blocks now cap to a small preview
  (VERBOSE_TRAIL_MAX_CHARS=800/12 lines), not the 16KB live-render budget.
  Retained trail strings drop ~17x (12.2MB -> 0.7MB at 800 msgs); the live
  streaming tail still uses the larger LIVE_RENDER budget.
- tui_gateway/server.py: lower the gateway-side verbose text cap to match
  (1KB/16 lines) so we stop shipping output the TUI no longer renders.
- memoryMonitor.ts: derive critical/high thresholds from the real V8 heap
  ceiling (~88%/70%) instead of the hardcoded 2.5GB that killed the process
  at 31% of an 8GB ceiling; add a one-shot onWarn early-warning on fast
  sub-threshold heap growth so the next such death is diagnosable, not silent.
- entry.tsx: wire onWarn to a crash-log breadcrumb + stderr line.

Full tool output is unchanged in the agent context and SQLite session — this
is display/transport only, no behavior or context change.

Fixes #34095. Related #27282.

Tests: ui-tui text + new memoryMonitor suites (33 pass), python verbose-cap
guard (5 pass); full ui-tui suite shows no new failures vs pristine main.
E2E repro confirms the retention drop.
2026-06-03 06:00:22 -07:00
Brooklyn Nicholson
dfba3f3e51 fix(tui): clear selection on right-click copy + group transcript blocks
Two TUI polish fixes.

(1) Right-click copy now clears the highlight.
The right-click handler copied an active selection via onCopySelectionNoClear
(the copy-on-select variant that keeps the highlight during a drag) and never
cleared it, so after right-click-to-copy the selection stayed lit with no
confirmation and a follow-up right-click re-copied the stale range instead of
pasting. A successful right-click copy now clears the selection and notifies;
if the copy fails (no clipboard path) the highlight survives and we fall back
to the right-click paste handler, exactly as before.

(2) Group transcript blocks so boundaries read clearly.
Model replies, reasoning/tool trails, and system/error notes rendered with no
vertical separation, so distinct block types butted together and were hard to
scan. Group adjacent blocks by kind: one blank line opens only where the visual
group changes (model prose <-> reasoning/tool trails <-> notes), while a run of
same-kind blocks renders flush. The rule lives in domain/blockLayout.ts
(messageGroup + hasLeadGap) and is applied intrinsically in MessageLine via a
`prev` prop, which fixes the things ad-hoc per-block margins kept breaking:

  - Streaming stability: the gap is derived from the stable predecessor, never
    the live block's own changing text, so the actively-streaming reply computes
    the same gap while it streams as the settled segment does once it flushes.
    No reflow/jump.
  - Transparent empty trails: a trail hidden by /details, or one carrying only a
    token tally (the finalDetails segment message.complete appends), renders
    nothing and is transparent to grouping (prevRenderedMsg skips it), so there
    are no floating gaps, no doubled gap after a prompt, and no padded space
    above the final reply. In the default/collapsed modes content-bearing trails
    always render, so the grouping is a no-op there.

The virtual-height estimator counts the group-boundary line so scroll math
stays accurate before Yoga remeasures.

ui-tui/src/domain/blockLayout.ts (new), components/messageLine.tsx,
components/streamingAssistant.tsx, components/appLayout.tsx,
lib/virtualHeights.ts, app/useMainApp.ts.

Tests: blockLayout.test.ts (grouping + hidden/empty-trail visibility),
virtualHeights leadGap, app-mouse.test.ts copy behavior. Full ui-tui suite
green apart from 3 pre-existing local/env failures (cursorDrift, ink-resize,
virtualHeights user-prompt-width) unchanged from main.
2026-06-02 22:03:38 -05:00
ethernet
a51a7b9b92 fix(node/nix): consolidate workspace lockfile + update all consumers
Consolidate per-package package-lock.json files into a single root-level
workspace lockfile.  Update all consumers:

- Nix: shared src/npmDeps/npmDepsHash in lib.nix; devshell hook stamps
  package.json paths then runs npm ci from root; individual .nix files
  use mkNpmPassthru attrs instead of per-package fetchNpmDeps.
- Python CLI: new _workspace_root() helper so _tui_need_npm_install,
  _make_tui_argv, _build_web_ui resolve lockfile/node_modules from the
  workspace root.
- Desktop: replace --force-build/mtime heuristic with content-hash build
  stamp (_compute_desktop_content_hash via pathspec).  Remove --force-build
  flag.
- Dockerfile: single root npm install; no per-directory lockfile copies.
- CI: nix-lockfile-fix and osv-scanner reference root package-lock.json;
  apps/dashboard → apps/desktop.
- Tests: new test_tui_npm_install.py; desktop stamp tests in
  test_gui_command.py; updated assertions in test_cmd_update.py,
  test_web_ui_build.py, test_dockerfile_pid1_reaping.py.
- Docs: remove --force-build from desktop flag table.

Deleted: apps/desktop/package-lock.json, ui-tui/package-lock.json,
ui-tui/packages/hermes-ink/package-lock.json, web/package-lock.json.
2026-06-02 20:28:18 -04:00
brooklyn!
fabca0bdd8
feat(tui): single /model command + unified Sessions overlay (#37112)
* feat(tui): single /model command + unified Sessions overlay

Collapse the redundant `/provider` alias so `/model` is the only name
everywhere (it already drove the same 2-step ModelPicker in the TUI).

Merge the separate `/resume` (cold history browser) and `/sessions` (live
switcher) surfaces into one Sessions overlay reached by `/resume`,
`/sessions`, `/session`, and `/switch`. It pins a "+ new" row at the top
(always visible), lists live sessions with status, and lists resumable
history below — dispatching session.activate for live rows vs resume for
cold ones, with close/delete in place. Fixes `/session` opening an empty
live-only switcher and the hidden new-session affordance.

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* fix(tui): address Copilot review on the Sessions overlay

- Track the armed history-delete by session id instead of row index so the
  1.5s live-status poll re-indexing rows can't redirect the second `d` to a
  different session.
- Re-add the busy-session guard to immediate `/resume <id>` and `/sessions new`
  actions (browsing the bare overlay stays allowed) so resuming/switching can't
  corrupt an in-flight turn's streaming/busy state.

* fix(tui): guard cold-resume (not live-switch/new) from the Sessions overlay

Copilot flagged that overlay actions bypassed the busy guard. Only cold
resume actually closes the current session, so only it is guarded — both
from the slash path and now from the overlay (appActions.resumeById).
Switching between live sessions and starting a `+ new` live session keep
the current session running in the background, so they stay unguarded:
that concurrency is the orchestrator's whole purpose. Also dropped the
over-broad guard on `/sessions new` for the same reason.

* fix(tui): address Copilot review (history dedup + desktop /provider)

- The 1.5s poll now re-derives the resumable list from the RAW session.list
  results (rawHistoryRef) against the current live set, so a session hidden
  while live reappears in history once it closes — instead of being lost
  until a full reload. Delete also prunes the raw ref.
- Drop the dead `/provider` entry from the desktop PICKER_OWNED_COMMANDS now
  that the alias is gone, so the desktop client no longer advertises it.

* fix(tui): surface session.list errors + keep selection stable across polls

- A garbled session.list response now surfaces an error and preserves the
  last good raw history, instead of silently blanking the resumable section.
- The 1.5s poll re-anchors the selection to the same row by session id
  (live or history) when the live list grows/shrinks, so the highlight no
  longer drifts to a different row mid-interaction.

* fix(tui): degrade session.list independently + cover overlay helpers

- Fetch active_list and session.list via Promise.allSettled so a failing
  session.list no longer rejects the whole load: live sessions still render
  and only the resumable history degrades (with an error).
- Add unit tests for the new helpers (sessionRowKindAt row ordering,
  resumableHistory dedupe, sessionsCountLabel, relativeSessionAge).

* test(tui-gateway): assert /provider alias is gone, /model remains

The CI test_complete_slash_includes_provider_alias asserted the removed
`/provider` alias still autocompleted. Flip it to lock in the removal:
`/pro` no longer offers `provider`, and `/mod` still completes `model`.

---------

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-06-01 22:28:36 -04:00
Brooklyn Nicholson
13a2350c8d fix(tui): pass indicatorStyle into FaceTicker so render matches reservation
FaceTicker now takes the indicator style as a prop (same value used by
busyIndicatorWidth) instead of reading the store independently, so the
rendered busy indicator and its reserved width can't desync on /indicator
changes.
2026-06-01 21:02:32 -05:00
Brooklyn Nicholson
899e8b9067 fix(tui): keep fmtCwdBranch default, cap cwd at the status-bar call site
Reverts the shared fmtCwdBranch default (28 → 40) so it isn't an API/
behavior change for other callers, and instead passes max=28 explicitly
from the status-bar caller where the tighter cap is intended.
2026-06-01 20:55:14 -05:00
Brooklyn Nicholson
e25b2a6e18 fix(tui): address Copilot review on status-bar tail disclosure
- Render SpawnHud last in the tail so its un-budgeted (dynamic) width can
  only truncate itself, never push budgeted segments past leftWidth.
- Precompute kaomoji/emoji frame widths once at module load instead of
  rescanning FACES/EMOJI_FRAMES on every status render.
- Correct the tail-priority comment to match the actual fits() order
  (bar, duration, compressions, voice, session count, bg, cost).
2026-06-01 20:49:51 -05:00
Brooklyn Nicholson
9cb7d40d8d fix(tui): derive busy/duration reservation width from fmtDuration
fmtDuration renders a space between units (e.g. `59m 59s`), so the flat
6-col reservation under-counted and could let the elapsed-time tail shove
the model off-screen / break the whole-segment budget. Reserve the bounded
clock width from fmtDuration itself (MAX_DURATION_WIDTH) in both the busy
indicator reservation and the tail duration budget.
2026-06-01 20:42:04 -05:00
Brooklyn Nicholson
2f171743b7 fix(tui): pin status/model, whole-segment tail disclosure, smaller cwd
The previous reservation set the left box width but everything still
shared one flex row, so the lower-priority tail + cwd could still shrink
`ready`/model down to fragments ("re"). Pin the essentials (indicator +
model + context) in a non-shrinking group, and render the tail segments
(bar, duration, compressions, voice, session count, bg, cost) only when
the whole segment fits in the leftover space — in priority order — so
nothing truncates mid-segment and the low-value tail drops first.

Also shrink the cwd/branch label (max 40 → 28) so it stops dominating the
bar on roomy-but-not-huge terminals.
2026-06-01 20:32:27 -05:00
Brooklyn Nicholson
1d7a1c00b4 fix(tui): make busy status-bar reservation /indicator-style aware
The left-content reservation used a flat constant for the busy face,
but its width varies by /indicator style: kaomoji is a wide glyph plus
a rotating verb, while unicode is a bare 1-col braille spinner with no
verb. Reserve the real width via busyIndicatorWidth(style, hasDuration)
so the model stays on-screen across styles without over-reserving the
unbounded elapsed-time tail.
2026-06-01 20:28:43 -05:00
Brooklyn Nicholson
e59b815c04 fix(tui): prioritize status/model over cwd in the status bar on narrow terminals
The status rule reserved only 8 cols for the left segments, so the
cwd + git-branch label on the right could grow until the loading
indicator, model, and context read-out were crushed to almost nothing
(sometimes collapsing to a single illegible line) on small screens.

Reverse the priority: `statusRuleWidths` now reserves the display width
of the must-keep left content (status indicator + model + context) so
the cwd/branch segment truncates first. Add `statusBarSegments(cols)`
progressive disclosure — as the terminal narrows the low-priority tail
sheds in order (cost → bg → voice → compressions → duration → context
bar), and below the bar breakpoint the context read-out collapses to a
bare token count. Status and model are always guaranteed room.

Default `minLeftContent = 0` keeps `statusRuleWidths` byte-identical for
existing callers.
2026-06-01 20:26:41 -05:00
kshitijk4poor
7527e7aeac feat: fuzzy search for the model picker (WebUI + TUI)
Adds fuzzy subsequence matching with quality ranking to the model
pickers, replacing the WebUI's exact-substring filter and giving the
TUI a search where it previously had none.

- New fuzzy scorer (ui-tui/src/lib/fuzzy.ts + an identical copy at
  web/src/lib/fuzzy.ts, since the two are separate TS packages with no
  shared module). Matches a query as an ordered subsequence (so `g4o`
  matches `gpt-4o`), scores by quality (exact > prefix > word-boundary >
  contiguous > scattered) and returns matched character positions for
  highlighting. Multi-token AND semantics (`clad snnt` -> claude-sonnet).
  15 vitest tests cover the algorithm.

- WebUI ModelPickerDialog: ranked fuzzy filter on providers + models;
  matched characters in model rows are highlighted via <mark>.

- TUI modelPicker: type-to-filter on the provider and model stages with
  live ranking. Backspace edits the filter, Ctrl+U clears it, Esc clears
  a non-empty filter before navigating back. Persist-global / disconnect
  shortcuts moved from g/d to Ctrl+G / Ctrl+D so letters feed the filter.

Closes #30849
2026-06-01 16:58:58 -07:00
Teknium
3f7d1c801d feat(undo): /undo [N] backs up N user turns with prefill + soft-delete
Extends the existing /undo command from a single in-memory exchange
removal into a full rewind: back up N user turns (default 1), soft-delete
the truncated rows in SessionDB (active=0, kept for audit, hidden from
re-prompts and search), notify memory providers, and prefill the composer
with the backed-up message text for editing — CLI and TUI.

Reuses the SessionDB rewind primitives, the on_session_switch(rewound=True)
memory hook, and the TUI command.dispatch prefill payload from SaguaroDev's
#21910 work, wired to /undo [N] instead of a separate /rewind picker.

- cli.py: undo_last(n, prefill) — in-memory truncate + SQLite soft-delete
  + agent surgery (system-prompt invalidate, flush-index reset) + memory
  notify + editable buffer prefill; /undo dispatch parses optional count;
  checkpoint-rollback caller passes prefill=False
- tui_gateway/server.py: command.dispatch undo branch (was rewind) parses
  count, picks Nth-from-last user turn, clamps to oldest
- commands.py: /undo gains [N] args_hint
- tests: rename + expand TUI suite (multi-turn, clamp, invalid-count)
- release.py: AUTHOR_MAP entry for SaguaroDev

Co-authored-by: SaguaroDev <74339271+SaguaroDev@users.noreply.github.com>
2026-06-01 01:22:38 -07:00
SaguaroDev
243e836dce feat(tui): wire /rewind through command.dispatch + prefill payload (#21910)
Adds the TUI half of the /rewind feature so the Ink terminal UI gets
the same affordance as the prompt_toolkit CLI.

Python side (tui_gateway/server.py):
- /rewind added to _PENDING_INPUT_COMMANDS so slash.exec rejects it
  and the TUI falls through to command.dispatch (the only path with
  access to live session state + memory hooks).
- New command.dispatch branch for name == "rewind":
  v1 auto-picks the most recent user turn (Claude-Code-style single-
  step undo), calls SessionDB.rewind_to_message, refreshes the
  in-memory history, fires _memory_manager.on_session_switch with
  rewound=True, and returns the new "prefill" payload.
- A dedicated picker overlay (multi-step rewind) is tracked as a
  follow-up to #21910.

TS side (ui-tui/src/):
- New "prefill" variant on CommandDispatchResponse + asCommandDispatch
  validator. Mirrors "send" but does NOT auto-submit; the client drops
  the message into the composer for editing.
- createSlashHandler renders the optional notice via sys() and calls
  ctx.composer.setInput(d.message), letting the user edit-and-resubmit
  the rewound turn — the core UX promised by the issue.

Tests:
- 7 new tui_gateway tests covering prefill payload shape, in-memory
  history truncation, DB soft-delete, memory-provider notification
  (rewound=True), busy-session refusal, missing-session error, and
  registry placement in _PENDING_INPUT_COMMANDS.
- Extended asCommandDispatch vitest covering the new prefill variant
  (with + without notice, and rejection of malformed payloads).

Out of scope for v1 (tracked as #21910 follow-up):
- Dedicated picker overlay in Ink (the multi-step rewind UI). v1 auto-
  picks the most recent user turn, matching the most common case.
- Gateway platforms (Telegram, Discord, etc.) — issue scopes v1 to
  CLI + TUI only.
2026-06-01 01:22:38 -07:00
Teknium
cd8aa389c9
Revert "fix(tui): clamp bogus terminal dimensions (WSL 131072x1) (#35657)" (#36096)
This reverts commit b1d34cf6e2.
2026-05-31 15:51:11 -07:00