hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-21 10:22:18 +00:00

Author	SHA1	Message	Date
Alex Yates	fad4b40d9d	fix(model): persist /model switch by default across sessions A plain /model <name> switch only lasted for the current session — every new session reverted to the previously-configured model, so users had to re-switch every time (e.g. glm-5.1 -> glm-5.2 on every launch). Persist-by-default is now the behavior across all three /model surfaces (CLI, gateway, TUI/dashboard), gated by a new config key model.persist_switch_by_default (default true): /model <name> switch model (persists to config.yaml) /model <name> --session switch for this session only /model <name> --global switch and persist (explicit, unchanged) The effective persistence is resolved once via resolve_persist_behavior() in hermes_cli/model_switch.py so --session opts out, --global opts in, and the config-gated default applies otherwise. --global remains a valid explicit no-op alias for the new default.	2026-06-19 07:07:06 -07:00
OYLFLMH	c1ffd4c3b4	fix(cli): make refresh_interval configurable, default to 0 (disabled) Commit `6724daa2c` added refresh_interval=1.0 to keep the idle clock ticking, but unconditional 1 Hz redraws in non-fullscreen prompt_toolkit mode cause terminal emulators (Xshell, iTerm2, Windows Terminal) to auto-scroll to the bottom on every tick — breaking scroll-up to read history. Drive it from display.cli_refresh_interval (0 = disabled, the default) so users who want the ticking clock can opt in without affecting everyone. Fixes: #48309 Related: `6724daa2c`, `8972a151a`	2026-06-19 07:06:34 -07:00
JoaoMarcos44	e48554a3e0	feat(cli): lock hermes worktrees so concurrent processes can't clobber them git worktree lock at creation and unlock before removal. A locked worktree refuses 'git worktree remove' (and prune), so a second hermes process or a stray cleanup can't silently delete an in-use isolated worktree. Fail-soft on both paths — a lock/unlock error never blocks the session or cleanup. Salvaged from #47029 (Issue #46303). Unlock moved to the actual-removal path so a preserved (unpushed-commits) worktree stays locked while in use.	2026-06-18 19:15:04 -07:00
teknium1	1d2e359678	fix(cli): surface a visible warning when the session store is unavailable When SessionDB init fails, the CLI/Desktop previously continued live with only a buried log line. The chat looks healthy, but the transcript is never written to state.db — so resume later shows a truncated or empty session and the user only discovers the loss after the fact (#41386). Emit a prominent stderr banner at startup when the store is unavailable, making it explicit that the conversation will not be saved and cannot be resumed, with a pointer to fix the store. Also set _session_db_unavailable so downstream code can detect the degraded state.	2026-06-18 19:14:52 -07:00
islam666	9705e7944a	fix(picker): remove max_models=50 cap in interactive model pickers The interactive model pickers (Desktop REST API, TUI model.options, CLI /model) were hard-capped at max_models=50, which truncated large provider catalogs like Kilo Gateway (336 models) to just 50 entries. This made most models undiscoverable via the picker search box. Changes: - Change build_models_payload() default from max_models=50 to None (unlimited) - Change list_authenticated_providers() default from max_models=8 to None - Change list_picker_providers() default from max_models=8 to None - Fix all [:max_models] slicing to handle None as 'no limit' - Remove max_models=50 from 5 interactive picker callers: * web_server.py: get_model_options (Desktop /api/model/options) * web_server.py: get_recommended_default_model * model_switch.py: prewarm_picker_cache_async * tui_gateway/server.py: model.options JSON-RPC * cli.py: HermesCLI model picker - Telegram/Discord inline keyboard picker (gateway/slash_commands.py) still passes max_models=50 explicitly — unchanged behavior. The total_models field was already in the response payload and is now meaningful since models.length == total_models for interactive pickers. Fixes #48279	2026-06-18 13:47:31 -07:00
kyssta-exe	81ff916e57	fix(agent): flush un-persisted messages before session rotation (#47202 ) compress_context() rotates the session (end_session -> create_session) mid-turn when auto-compress triggers, but never called _flush_messages_to_session_db() first. Messages generated during the current turn that hadn't been persisted to state.db were silently lost. The same bug existed in cli.py:new_session() (/new command). Both paths now flush un-persisted messages before ending the old session.	2026-06-18 13:38:35 -07:00
Siddharth Balyan	73cd8622f9	feat(billing): /billing terminal billing — interactive TUI + CLI client (#45449 ) * feat(billing): nous_billing http client + BillingState core (phase 2b) Phase 2b terminal-billing client foundation: - hermes_cli/nous_billing.py: typed client for the 4 /api/billing/* endpoints (state/charge/poll/auto-top-up). Raises typed errors (BillingScopeRequired, BillingRateLimited, BillingAuthError) mapped from the live-verified contract; fail-open is the caller's job. Idempotency-Key enforced client-side. - agent/billing_view.py: surface-agnostic BillingState core + Decimal money parsing (server emits decimal strings, not 2dp), fail-open builder, idempotency-key gen, custom-amount validation. - 51 unit tests (decimal parse/format, payload tiering, error->exception matrix, fail-open, amount validation). Plan: docs/plans/2026-06-13-001-phase-2b-terminal-billing-tui-plan.md * feat(billing): billing:manage scope + lazy step-up re-auth (phase 2b) - NOUS_BILLING_MANAGE_SCOPE constant. - nous_token_has_billing_scope(): split-based scope check (no false-positive substring match). - step_up_nous_billing_scope(): re-runs the device flow requesting billing:manage, reusing the held credential's portal/inference URLs + client_id (so a preview stays a preview), persists like _login_nous but WITHOUT the model picker. Returns True iff the minted token carries the scope (False when NAS silently downscopes a non-admin / unticked grant). Lazy step-up (plan D-A): normal login path unchanged; 403 insufficient_scope from a billing call triggers this. 7 unit tests. * feat(billing): billing JSON-RPC methods for the TUI (phase 2b) billing.state / charge / charge_status / auto_reload / step_up in tui_gateway/server.py. Return STRUCTURED success envelopes (result.ok + result.error=<code>) rather than JSON-RPC-level errors, so the Ink rpc() promise always resolves and the TUI branches on the typed billing error code (insufficient_scope, rate_limited, no_payment_method, …) to render the right affordance. Money serialized as decimal STRINGS + display strings. charge mints + echoes an idempotency_key for retry reuse. 16 unit tests. * feat(billing): /billing CLI handler + command registry (phase 2b) - CommandDef("billing", subcommands=buy\|auto-reload\|limit), added to _SLACK_VIA_HERMES_ONLY so it routes via /hermes on Slack (keeps the 50-cap parity test green, same as /credits). - cli.py::_show_billing + screen helpers: all 5 screens (overview, buy→confirm→ poll, auto-reload, monthly-limit read-only). Reuses _prompt_text_input_modal / _prompt_text_input (D-C). Non-interactive (_app is None) renders text + portal deep-link, never prompts (R7). Decimal money end-to-end. 2s/5-min cancellable poll loop; 429/503 = retry not failure; settled = ledger truth. Lazy step-up on 403 insufficient_scope. no_payment_method treated as mainline funnel-to-portal. - 6 CLI tests; 156 command tests (incl. Slack/Telegram parity) green. * feat(billing): /billing Ink TUI screens + tests (phase 2b) - ui-tui/src/app/slash/commands/billing.ts: /billing TUI command covering all 5 screens — overview (text), buy <amt> → ConfirmReq → charge → non-blocking 2s/ 5-min poll loop → settled/failed/timeout branches, auto-reload <below> <to> → ConfirmReq → PATCH, limit (read-only). Reuses the existing ConfirmReq overlay (D-C) — no bespoke component. Typed-error envelope branching: insufficient_scope arms the lazy step-up confirm; no_payment_method/rate_limited/cap funnel to portal. Client-side amount validation mirrors the server (bounds + 2dp). - gatewayTypes.ts: Billing* response interfaces. - registry.ts: register billingCommands. - billingCommand.test.ts: 12 vitest cases (overview/gating/buy-confirm-poll- settled/no_payment_method/step-up/limit/auto-reload/validation). TUI build green; 12/12 vitest pass; slash tests pass once @hermes/ink is built. * docs(billing): scrub private cross-repo references NAS is a private repo — remove all references to it from the public PR: - drop the cross-repo planning doc (planning scaffolding, not a deliverable; the PR description documents the design) - replace 'NAS' / 'PR #412 preview' mentions in code + test comments with generic 'the server' / 'a preview deployment' * docs(billing): scrub final NAS reference in step-up docstring * docs(billing): drop dangling plan-doc refs The phase-2b plan doc was removed in the cross-repo scrub (`300afcc0b`) but two module docstrings still pointed at it. Drop the dead refs. * feat(billing): interactive /billing overlay + step-up UX, portal-URL & token fixes Adds the interactive /billing TUI overlay and hardens the terminal-billing client across CLI and TUI. - TUI: full /billing overlay state machine (overview to buy to confirm, auto-reload, read-only monthly limit) reusing the existing confirm overlay. - Step-up: surface the verification link in-transcript and open the browser via the TUI's own opener (the device flow runs in the headless gateway, so a printed URL was being dropped); run the step-up handler off the main loop and emit the link as an out-of-band event so the gateway stays responsive. - Step-up copy is scope-accurate ("Billing permission granted") and re-checks /state so it never claims "enabled" when the org kill-switch is still off. - Portal deep-links resolve to absolute URLs against the active portal base (the server emits them relative) - fixes a bare "/billing?topup=open" link. - Billing calls refresh an expired access token via the stored refresh token instead of reporting a false "not logged in". - Optimistic funnel: advise "set up a saved card on the portal" up front when no card is on file (advisory, not a hard gate). - Token resolution is cached briefly so the 2s charge poll loop stops re-locking + re-reading the auth store on every tick; 401 re-resolves fresh. - Remove the temporary demo-mode shims. Validation: 87 Python billing tests, 88 TS tests (billing command + gateway event handler), tsc clean, ink + ui-tui builds green. * docs(billing): add /billing TUI screenshots for PR * fix(cli): guard _last_invalidate on bare instances; update stale prompt-fallback test The UI-invalidate throttle read self._last_invalidate unconditionally, which raised AttributeError on HermesCLI instances built without __init__ (the thread-safety test's object.__new__ shell). Guard the read with getattr. The off-main-thread branch of _prompt_text_input was changed (#23185) to cancel cleanly to None instead of falling back to a bare input() that would hang on the slash-worker thread; the test still asserted the old direct-input fallback. Update it to assert the current intended behavior: returns None, calls neither run_in_terminal nor input(), and does not hang.	2026-06-19 01:53:32 +05:30
xxxigm	f48b312037	fix(cli): keep typing responsive by not blocking the keystroke loop The interactive CLI input box runs its completer with `complete_while_typing=True`, so `SlashCommandCompleter.get_completions` is invoked on every keystroke. That completer does blocking I/O: fuzzy `@`-file indexing shells out to `rg`/`fd` (up to a 2s timeout) and file-path completion calls `os.listdir` + `stat`. Because the completer was passed inline (never wrapped in `ThreadedCompleter`), all of this ran synchronously on the prompt_toolkit event loop, stalling the render after each key — very noticeable on WSL2 and other slow-filesystem setups ("typing in the prompt box being very latent"). Two fixes: - Wrap the input completer in `ThreadedCompleter` so completion work runs off the UI event loop and never blocks rendering between keystrokes. - Stop treating URLs as file paths in `_extract_path_word`: a token like `https://example.com/x` contains `/`, so it triggered `os.listdir` on every keystroke while typing/pasting a link (listing a bogus `https:` dir) for a completion that can never be useful. Skip any token with a `://` scheme separator. (cherry picked from commit `b5be2ba276`)	2026-06-17 12:32:38 +05:30
Teknium	c66ecf0bc3	feat(delegation): async background subagents via delegate_task(background=true) (#40946 ) * feat(delegation): async background subagents via delegate_task(background=true) delegate_task(background=true) dispatches a subagent that runs in the background and returns a handle immediately, so the user and model keep working while it runs. The full result — plus the original task source — re-enters the conversation as a new turn when the subagent finishes, riding the same completion-queue rail as terminal background processes. - tools/async_delegation.py: daemon-executor registry, capacity cap, rich self-contained completion event pushed onto the shared process_registry.completion_queue (type='async_delegation'). - delegate_tool.py: background param + single-task dispatch branch; batch async rejected (v1). - process_registry.py: format_process_notification renders the rich task-source block (goal/context/toolsets/model/status/result). - gateway/run.py: dedicated _async_delegation_watcher drains + injects results into the originating session (idle + post-turn), session_key routing enrichment, shutdown interrupt of dangling delegations. - config: delegation.max_async_children (default 3). Reuses the existing idle-drain wiring rather than mutating a running agent loop, preserving message-role alternation and prompt-cache invariants. 13 targeted tests; CLI + gateway paths E2E-verified. * test(delegation): make async non-blocking tests environment-independent CI 'test (5)' flaked on a cold, 8-worker runner: the first delegate_task(background=true) call measured 2.27s of one-time setup (config load + child-agent construction + imports), tripping the elapsed < 1.0 wall-clock assertion. That assertion was testing setup overhead, not blocking. Replace the wall-clock thresholds with the real invariant: dispatch returns while the child is still gated (active_count == 1, completion queue empty), which a synchronous impl could not do. Keep only a loose 4s sanity backstop well under the runner's 5s gate. * fix(delegation): harden async background delegation Follow-up review fixes: - Detach background child from parent._active_children at dispatch — otherwise parent-turn interrupts (Ctrl+C, mid-turn steering), cache evicts (release_clients), and session close (/new) kill/close the detached subagent mid-run, defeating the point of background mode. Lifecycle is owned by the async registry's interrupt_fn. - Make the capacity check atomic with the record insert (TOCTOU: two concurrent dispatches could both pass active_count() and exceed the cap). - TUI dedup: key async_delegation events by delegation_id — the fallthrough keyed them all as ("", type), suppressing every completion after the first in the desktop/TUI status feed. - CLI /stop now interrupts running background delegations and /agents lists them (they live outside the process registry and were invisible). - Drop stray unbalanced ']' line from the re-injection block and the unused _ASYNC_DEFAULT import. Tests: detach-at-dispatch + concurrent-capacity race added (15 total in test_async_delegation.py); 137 delegate + 140 process-registry/notify/watch + 7 TUI dedup tests pass. * fix(delegation): harden async background completion drains	2026-06-15 13:33:12 -07:00
Teknium	3e7e9b24d4	fix: harden salvaged session and browser improvements Polish salvaged contributor work before PR review: - read browser inactivity timeout from config with documented fallback - skip redundant v10 trigram backfill before v11 FTS rebuild - show delegate_task goals safely in progress previews - show gateway status model/context without redundant token wording - wire gateway /sessions to shared session-listing helpers - map Ravenwolf author emails for release attribution Co-authored-by: Wolfram Ravenwolf <github.com@wolfram.ravenwolf.de> Co-authored-by: Amy Ravenwolf <amy@ravenwolf.de>	2026-06-15 07:46:34 -07:00
Teknium	723c2331bd	fix: make profile subprocess HOME policy explicit	2026-06-14 03:20:21 -07:00
Teknium	6724daa2c2	fix: keep CLI idle timer ticking (#45592 )	2026-06-13 05:55:04 -07:00
H-Ali13381	2abcae9678	fix(cli): preserve renderer state on resize	2026-06-13 05:40:18 -07:00
Siddharth Balyan	7ba5df0d52	feat(billing): /credits command — balance + portal top-up handoff (#44776 ) * feat(billing): /usage → portal top-up browser handoff Add the terminal side of the billing slice (phase 2a): start a top-up by throwing the user to the portal billing page with the top-up modal open. The terminal does not confirm, poll, or track payment — checkout completes in the browser and the next /usage shows the new balance. - nous_account.py: parse organisation.slug/name from /api/oauth/account into NousPortalAccountInfo; add nous_portal_topup_url() building the org-pinned {base}/orgs/{slug}/billing?topup=open with a null-slug fallback to the legacy {base}/billing?topup=open (never /orgs/None/...). - portal_cli.py: 'hermes portal topup' — fresh account fetch, identity line (Topping up as <email> / org <name>), browser open with printed-URL fallback, no-wait closing copy. No polling/confirmation (deferred to 2b). - account_usage.py: the shared /usage credits block now links the org-pinned top-up URL (auto-opens the modal) + points to the command. Depends on NAS #409 (organisation.slug/name + ?topup=open). Do not merge until that is live on the target env; until then /api/oauth/account returns organisation: { id } only and the URL falls back to legacy. * feat(billing): /credits command for balance + top-up handoff Replace the standalone `hermes portal topup` subcommand with an in-session /credits slash command — a focused money surface (balance in, top-up out) that works in the CLI, TUI, and every messaging platform from one registry entry. - commands.py: register /credits (Info category). Slack is at its 50-slash cap, so /credits is routed via /hermes credits on Slack only (new _SLACK_VIA_HERMES_ONLY set) to avoid clamping a canonical command off the native list and breaking Telegram parity; native everywhere else. - account_usage.py: build_credits_view() — one portal fetch → balance lines + identity line + org-pinned top-up URL + depleted flag, consumed by all surfaces. Reuses the same snapshot/URL builder as /usage so numbers match. - cli.py: _show_credits() — balance block + identity line + 3-button panel (Open top-up / Copy link / Cancel) via the existing prompt_toolkit modal. ASK, never auto-launch; headless falls back to printing the URL. - gateway/slash_commands.py: _handle_credits_command() — renders the block + tappable top-up URL + no-wait copy; works on button and plain-text platforms. - /usage credits line now points to /credits. - Retire `hermes portal topup` (portal_cli.py back to baseline); the engine (slug/name parse + nous_portal_topup_url) stays as the shared core. No polling, no payment confirmation (billing phase 2a). Depends on NAS #409. * fix(credits): /credits works in the TUI slash-worker (non-interactive) In the TUI, /credits runs in the slash-worker subprocess where there is no live prompt_toolkit app and stdin is the JSON-RPC pipe. _show_credits called the 3-button modal unconditionally, which fell back to reading stdin → exception → slash.exec rejected → the command produced no output (only the pre-existing 'Credit access paused' banner showed). - _show_credits: when self._app is None (TUI worker / piped / non-interactive), render the text variant — balance block + tappable top-up URL + no-wait line, same affordance as the messaging surfaces — and skip the modal entirely. The 3-button panel still renders in the interactive CLI. - Depleted banner copy: 'run /usage for balance' → 'run /credits to top up' now that /credits is the dedicated money surface (+ tests). - Regression tests: _show_credits with self._app=None renders text and never invokes the modal; logged-out path. * feat(tui): credits.view RPC for the /credits tappable top-up button Add a credits.view JSON-RPC method returning the structured CreditsView (logged_in, balance_lines, identity_line, topup_url, depleted) so the TUI can render a clickable <Link> top-up button instead of plain text. Account- independent (portal fetch gated on a logged-in Nous account), fail-open to {logged_in: false} on any hiccup. Mirrors session.usage's credits-block pattern. Frontend (TUI-local /credits command + Ink component) lands separately. * feat(tui): /credits command with keyboard-driven top-up confirm TUI-local /credits: fetches the structured balance via the credits.view RPC, prints the balance + identity + top-up URL, then arms the EXISTING confirm overlay (Enter = open top-up in browser via openExternalUrl, Esc = cancel). Reuses ConfirmReq — no new overlay component/state/input handler. Headless (openExternalUrl returns false) falls back to printing the URL. - gatewayTypes.ts: CreditsViewResponse. - commands/credits.ts: the command (mirrors /status's rpc+guarded pattern). - registry.ts: register creditsCommands. - test: balance+overlay armed, headless fallback, no-url, logged-out (4 cases). Matches the CLI /credits 'Enter to open' affordance. Phase 2a: no polling.	2026-06-12 08:51:10 +00:00
Teknium	4474873d2c	feat(cli): persist resolved approval/clarify prompts in scrollback (#44702 ) Some checks failed Deploy Site / deploy-vercel (push) Waiting to run Details Deploy Site / deploy-docs (push) Waiting to run Details Docker Build and Publish / build-amd64 (push) Waiting to run Details Docker Build and Publish / build-arm64 (push) Waiting to run Details Docker Build and Publish / merge (push) Blocked by required conditions Details Lint (ruff + ty) / ruff + ty diff (push) Waiting to run Details Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run Details Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run Details Nix Lockfile Fix / auto-fix-main (push) Waiting to run Details Nix Lockfile Fix / fix (push) Waiting to run Details Nix / nix (macos-latest) (push) Waiting to run Details Nix / nix (ubuntu-latest) (push) Waiting to run Details OSV-Scanner / Scan lockfiles (push) Waiting to run Details Tests / test (1) (push) Waiting to run Details Tests / test (2) (push) Waiting to run Details Tests / test (3) (push) Waiting to run Details Tests / test (4) (push) Waiting to run Details Tests / test (5) (push) Waiting to run Details Tests / test (6) (push) Waiting to run Details Tests / save-durations (push) Blocked by required conditions Details Tests / e2e (push) Waiting to run Details Typecheck / typecheck (apps/bootstrap-installer) (push) Waiting to run Details Typecheck / typecheck (apps/desktop) (push) Waiting to run Details Typecheck / typecheck (apps/shared) (push) Waiting to run Details Typecheck / typecheck (ui-tui) (push) Waiting to run Details Typecheck / typecheck (web) (push) Waiting to run Details uv.lock check / uv lock --check (push) Waiting to run Details Build Skills Index / build-index (push) Has been cancelled Details Build Skills Index / trigger-deploy (push) Has been cancelled Details Modal prompt panels (dangerous-command approval, clarify questions) live in the prompt_toolkit layout and vanish on the next repaint, leaving no trace of the question or the decision in chat history. Emit a dim one-line summary after each prompt resolves: ⚠ Approval: <command> → allowed for session ? Clarify: <question> → <answer> Gated on display.persist_prompts (default true). Detail and outcome are whitespace-collapsed and capped at 120 chars.	2026-06-12 01:14:35 -07:00
墨綠BG	81cdbbddc8	🐛 fix(cli): wrap approval preview hints	2026-06-11 23:05:08 -07:00
墨綠BG	d6df38bb6b	🐛 fix(cli): wrap long approval commands in prompt	2026-06-11 23:05:08 -07:00
Teknium	cb29e8a82e	refactor(cron): rebrand Cron Recipes -> Automation Blueprints Product rename across every surface: module/file names (blueprint_catalog, tools/blueprints, blueprint_cmd), slash command /cron-recipe -> /blueprint (alias /bp), dashboard API /api/cron/blueprints, desktop deep-link hermes://blueprint/<key>, docs catalog page + extract script, and the skill frontmatter block metadata.hermes.blueprint. No behavior change.	2026-06-11 10:49:47 -07:00
teknium1	e976faac7a	feat(cron-recipes): /cron-recipe <name> seeds a conversational fill Reworks the chat-line UX: pick a recipe by name and the agent asks you for what it needs, one question at a time, instead of forcing you to hand-type a slot=val command line. - /cron-recipe -> lists the catalog - /cron-recipe <name> -> forgiving name match (exact/prefix/substring/ fuzzy; ambiguous lists candidates), then seeds the agent with a natural-language fill request built from the recipe's typed slots + schedule and prompt templates. The agent asks for each value one at a time and calls the EXISTING cronjob tool. No new tool. - /cron-recipe <name> slot=val -> unchanged deterministic path (fill_recipe -> create_job) for the dashboard/docs/power user. Mechanism (no new plumbing, invariant-safe — the seed enters as a normal user turn, never a synthetic injection): - shared handler returns RecipeCommandResult{text, agent_seed}; match_recipe() and build_recipe_seed() are the new shared pieces. - gateway: dispatch rewrites event.text to the seed and falls through to the agent (the same pattern /steer uses). - CLI: handler sets a one-shot self._pending_agent_seed; the interactive loop consumes it right after process_command() and runs it as the next turn. The typed-slot schema stays the single source of truth (still validates the form/inline path via fill_recipe); the agent path just renders those slots into the questions to ask. Docs updated to lead with the name-then-ask flow.	2026-06-11 10:49:47 -07:00
teknium1	1593ca5406	feat(cron): Cron Recipes — parameterized automation templates across every surface A 'recipe' is a one-place definition of an automation that every surface renders natively. The slot schema (cron/recipe_catalog.py) is the single source of truth; four renderers consume it, and all paths end at the same cron.jobs.create_job — no second job engine. Form where there's a screen, conversation where there's a chat line: - Dashboard / GUI app: a Recipes sub-tab on the Cron page renders each recipe's typed slots as a form (time-picker, enum dropdown, free-text); submit POSTs /api/cron/recipes/instantiate which fills + creates the job. - CLI / TUI / messengers: /cron-recipe lists the catalog, shows a recipe's fields, or fills + creates from a pasted 'key slot=val' command. The shared handler (hermes_cli/cron_recipe_cmd.py) names any missing/invalid slot so the agent can ask a targeted follow-up. - Docs: a generated Cron Recipes catalog page (website, .mdx + React cards) shows each recipe with a copy-paste command and a 'Send to App' button. - Desktop: a hermes:// URL scheme (Electron single-instance lock + setAsDefaultProtocolClient + open-url/second-instance) routes hermes://cron-recipe/<key>?slot=val into the chat composer pre-filled. Typed slots (time/enum/text/weekdays) with defaults: users never type raw cron — recipes parameterize time-of-day and weekday sets and translate to cron expressions; a free-text 'schedule' slot is the full-flexibility escape hatch. Consent-first throughout: nothing schedules without an explicit submit or send. Core: - cron/recipe_catalog.py — CronRecipe + RecipeSlot, 5 curated recipes, recipe_form_schema / recipe_slash_command / recipe_deeplink / recipe_catalog_entry renderers, fill_recipe (validate + translate to create_job kwargs). - hermes_cli/cron_recipe_cmd.py — shared /cron-recipe handler (CLI + TUI + gateway never drift). CommandDef + dispatch in commands.py / cli.py / gateway/run.py. Dashboard: GET /api/cron/recipes + POST /api/cron/recipes/instantiate (web_server.py), CronRecipes.tsx gallery+form, Segmented sub-tab on CronPage, api.ts methods + types. Desktop: hermes:// scheme end to end (main.cjs deep-link router + ready-queue, preload onDeepLink/signalDeepLinkReady, global.d.ts types, desktop-controller composer prefill, electron-builder protocols key). Docs: extract-cron-recipes.py generator wired into prebuild.mjs, cron-recipes-catalog.mdx + CronRecipesCatalog React component, sidebar entry. Generated index json gitignored like skills.json. Tests: 23 core (catalog/slots/schedule-resolution/validation/renderers/command handler/generator) + 5 web_server endpoint tests. E2E verified end to end: slot fill -> create_job -> persisted job with correct schedule/deliver/origin.	2026-06-11 10:49:47 -07:00
Teknium	8972a151a4	feat(cli,tui): show time since last final agent response on the status bar (#44265 ) Adds an idle clock to the context/status bar in both the prompt_toolkit CLI and the Ink TUI: once a turn completes, a dim '✓ <elapsed>' segment shows how long the session has been idle since the last final agent response. Hidden while a turn is live (the per-prompt elapsed timer covers that) and before the first turn completes. - cli.py: track _last_turn_finished_at when the agent thread exits, surface it via _format_idle_since() in the snapshot, render in both the wide fragments path and the plain-text fallback. - ui-tui: stamp lastTurnEndedAt when busy flips false after a live turn, thread it through appStatus -> StatusRule, render via a ticking IdleSince segment sharing the duration breakpoint/width budget.	2026-06-11 06:06:19 -07:00
Teknium	b8e2c16579	Merge origin/main into salvage branch (resolve AUTHOR_MAP conflict)	2026-06-10 23:25:54 -07:00
Matt Van Horn	0b5b7ddfd2	fix(cli): show quick commands in /help output User-defined quick_commands from config.yaml now appear in the /help output under a "Quick Commands" section, between skill commands and tips. Fixes https://github.com/NousResearch/hermes-agent/issues/4090 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-06-10 22:55:52 -07:00
brooklyn!	3e74f75e41	feat(agent): coding-context posture across CLI/TUI/desktop/ACP (#43316 ) * feat(agent): coding-context posture with per-model edit-format tuning Hermes detects when it's running in a coding context — an interactive surface (CLI, TUI, ACP, desktop) sitting in a code workspace (git repo or recognised project root) — and shifts into a coding posture. Outside that (chat platforms, non-workspaces) nothing changes. The posture is modelled as a frozen RuntimeMode selected from a small ContextProfile registry (coding/general). A profile is data: the toolset to collapse to, the operating brief to inject, and seams for model routing and memory. Every domain reads the same resolved object instead of re-probing git/config on its own: - System prompt — RuntimeMode.system_blocks(): an operating brief (gather context before editing, edit through tools not chat, verify with terminal, cap retry loops) plus a live git/workspace snapshot, built once and baked into the stable prompt tier so per-conversation caching is preserved. - Per-model edit-format tuning — the brief nudges each model family toward the patch mode it handles best: OpenAI/Codex toward mode='patch' (V4A multi-file diffs), Anthropic toward mode='replace' (string replacement). The model id rides on RuntimeMode; unknown families keep neutral wording. - Skill index — non-coding skill categories are pruned from the prompt's skill index (discovery-only; skills_list/skill_view still reach the full catalog, with a disclosure note). - Toolset — only under the opt-in 'focus' mode does the posture collapse to the coding toolset + enabled MCP servers; the default posture is prompt-only and never overrides configured toolsets. Activation via agent.coding_context: auto (default), focus, on, off. Subagents inherit the posture for free via toolset inheritance + the shared prompt builder. Detection is not memoized so a long-lived gateway/TUI process can't pin a stale posture across working directories. * feat(agent): cover new-file authoring in the coding edit-format nudge The per-model edit-format guidance only addressed editing existing code (patch mode='patch' vs 'replace'), but authoring a brand-new file — write_file, not patch — is a large fraction of real coding work and the nudge was silent on it. Surfaced when building a single-file artifact where the dominant operation was write_file and the steering offered no guidance. Both family lines now lead with "author new files with write_file; for edits to existing code prefer ...". Tests assert write_file appears in each family's brief; unknown families still get neutral wording. * docs(agent): correct memoization docstring + clarify TUI config-load asymmetry * feat(agent): sharpen the coding posture — verify-loop facts, wider edit steering, $HOME guard Tuning pass on the coding posture from dogfooding it as a harness: - Workspace snapshot now hands the model its verify loop up front: detected manifests + package manager (lockfile sniff), the exact verify commands (package.json scripts, Makefile targets, scripts/run_tests.sh, pytest config), and which context files (AGENTS.md / CLAUDE.md / .cursorrules) exist at the root. Marker-only (non-git) projects get the snapshot too instead of nothing. The "verify before claiming done" brief line was the highest-value piece in evals — this turns it from advice into an executable loop instead of making the model rediscover the test command every session. Still stat-cheap, size-guarded reads, built once at prompt time. - Edit-format steering covers the families Hermes actually serves: Gemini and open-weight coding models (DeepSeek, Qwen, Kimi, GLM, Grok, Hermes, Llama, Mistral, Devstral, MiniMax) steer to mode='replace' — their RL scaffolds use str_replace-style editors. Previously only GPT/Codex and Claude families got steering; the models Hermes users disproportionately run all fell to neutral. - Operating brief gains four behaviors elite harnesses encode: batch independent reads/searches in one turn; fix root causes and the bug class (sibling call paths), not the reported site; no drive-by refactors/renames/reformatting; never read, print, or commit secrets. Plus a patch-failure escalation ladder: after the same region fails twice, rewrite the enclosing function/file with write_file instead of a third patch attempt. - $HOME dotfiles guard: a git repo rooted exactly at the home directory (or a marker sitting in it, e.g. a global ~/AGENTS.md) is user config, not a code workspace — without the guard, every session anywhere under a dotfiles-managed home silently flipped to the coding posture. Real projects under such a home still detect via their own markers/repos; 'on' mode bypasses the guard.	2026-06-10 23:06:44 -05:00
Teknium	4490c7cf8d	fix: in-memory transcript blocks empty-session prune CI caught tests/cli/test_cli_new_session.py asserting that /new keeps the old session row when conversation history exists in memory. The live transcript is authoritative: a session whose messages haven't flushed to the DB yet (or whose flush failed) must not be pruned. Guard _discard_session_if_empty on self.conversation_history and pin the behavior with a test.	2026-06-10 17:37:34 -07:00
Teknium	e96ca1a0d3	feat(sessions): drop empty sessions on CLI exit and session rotation Port from google-gemini/gemini-cli#27770: starting the CLI and immediately quitting (or rotating with /new, /clear) left an empty untitled session row behind. These ghost rows pile up in /resume, `hermes sessions list`, and the in-chat recent-sessions browser. - SessionDB.delete_session_if_empty(): transactional check-and-delete that only removes rows with no messages, no title, and no child sessions (delegate subagent parents are preserved). Also removes on-disk transcript files via the existing _remove_session_files. - HermesCLI._discard_session_if_empty(): thin wrapper, wired into the cli_close shutdown path and the new_session() rotation path. Skipped when /exit --delete already handles removal. Unlike the one-shot prune_empty_ghost_sessions migration (TUI-only, 24h-old rows), this prevents new ghost rows from accumulating at the moment they would be created.	2026-06-10 17:22:27 -07:00
Robin Fernandes	af978ecb17	fix(model): require confirmation for expensive model selections Rebased onto current main and re-ported across the restructured surfaces: model flows now thread confirm_provider/base_url/api_key through hermes_cli/model_setup_flows.py, the Discord picker lives in plugins/platforms/discord/adapter.py, and the web dashboard picker applies chat-mode switches via config.set so the expensive-model confirmation can ride the response. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 00:24:06 -07:00
mnajafian-nv	f8fd30942c	fix(cli): prevent duplicate one-shot finalize on interrupted cleanup (#43320 ) Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>	2026-06-09 22:41:04 -07:00
mnajafian-nv	d03cdd63eb	fix(cli): run one-shot query cleanup before lease release (#43036 ) * fix(cli): run one-shot query cleanup before lease release Signed-off-by: mnajafian-nv <mnajafian@nvidia.com> * test(cli): cover quiet one-shot cleanup finalization Signed-off-by: mnajafian-nv <mnajafian@nvidia.com> --------- Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>	2026-06-09 21:52:13 -07:00
Teknium	96af61b6ef	feat(memory,skills): approve/deny gate for memory + skill writes (#38199 ) Adds memory.write_mode and skills.write_mode (on\|off\|approve), applied to both foreground turns and the background self-improvement review fork — the source of the unprompted 'wrong assumption' saves users reported. - on (default): write freely, unchanged behaviour - off: never write; the tool returns a clean disabled result - approve: don't commit. Memory foreground writes prompt inline (small, reviewable in a chat bubble); background memory writes and ALL skill writes stage to a pending store instead (a SKILL.md is too large to review inline, and a daemon thread can't block on a prompt) Review staged writes from CLI or any messaging platform: /memory pending\|approve\|reject\|mode /skills pending\|approve\|reject\|diff\|mode Skill review respects the size asymmetry: inline you see a one-line gist; the full unified diff stays out-of-band (/skills diff, dashboard, or the staged JSON file). New: tools/write_approval.py (gate + pending store), hermes_cli/ write_approval_commands.py (shared CLI+gateway handlers). Gates wired at the single entry points memory_tool() and skill_manage(), using the existing write-origin ContextVar to distinguish foreground from background_review.	2026-06-09 21:51:43 -07:00
Austin Pickett	b8eede7bda	fix(cli): /plugins shows installed-but-not-enabled plugins The /plugins slash command read from the live PluginManager, which only knows about loaded plugins. A freshly-installed plugin that hadn't been enabled yet showed 'No plugins installed. Drop plugin directories into ~/.hermes/plugins/' — even though it was on disk and a valid plugin. Switch to the same disk-discovery path as 'hermes plugins list' (_discover_all_plugins + enabled/disabled sets + _plugin_status), so an installed plugin now appears with its activation state ([not enabled], enabled, or disabled) plus the exact enable command. Default the quick /plugins view to user-installed plugins and summarize bundled providers/platforms on one line (the full catalog stays behind 'hermes plugins list') so the output isn't drowned by 60+ bundled provider plugins.	2026-06-09 10:49:43 -07:00
firefly	ab98818e5b	fix(cli): use the confirm modal on native Windows instead of deadlocking input() Native Windows bypassed the destructive-slash modal and fell back to a raw input() prompt. When the confirm was triggered from the process_loop daemon thread (the normal case), that input() deadlocked against prompt_toolkit's main-thread stdin ownership: bare /reset froze with Ctrl-C swallowed, while /reset now worked only because it skips the prompt. Route native Windows through the existing call_soon_threadsafe modal path (the same key-binding channel that already handles normal typing on Windows); keep the stdin fallback only for the safe no-app / scheduling-failure cases, and clean-cancel (None) off the main thread on win32 so a degraded path never re-deadlocks. Addresses #33961 Refs #30768	2026-06-08 15:53:28 -07:00
Robin Fernandes	639c1e3636	feat(sessions): add optional max session cap	2026-06-08 15:12:12 -07:00
BarnacleBoy	550b72dd87	fix(cli): gate tool-rendering paths with tool_progress_mode, not quiet_mode quiet_mode was being used to suppress tool-result display when tool_progress_mode was 'off'. But quiet_mode also gates operational status messages, so users with /verbose + tool-progress off lost all status output. Adds a dedicated tool_progress_mode attribute to AIAgent; the tool_executor result-rendering path gates on tool_progress_mode != 'off'. The CLI passes its tool_progress_mode through agent setup and the tool-progress cycle command syncs it onto the live agent. Fixes #33860.	2026-06-08 11:29:53 -07:00
Robert Ban	4129092fda	fix(cli): strip OSC 8 hyperlink sequences in ChatConsole output prompt_toolkit's ANSI parser does not handle OSC escape sequences (\x1b]...\x07 / \x1b]...\x1b\), which caused Rich's [link=...] markup to leak raw OSC 8 payload into the banner title after /clear. Added _OSC_ESCAPE_RE to strip OSC sequences in ChatConsole.print() before routing through _cprint(). CSI/SGR color sequences are preserved. Visible text between OSC sequences is kept intact.	2026-06-08 11:29:53 -07:00
teknium1	094aa85c37	refactor(cli): extract agent-construction cluster into CLIAgentSetupMixin (god-file Phase 4) Lift the 5 agent-construction/session-resume methods out of HermesCLI into hermes_cli/cli_agent_setup_mixin.py:CLIAgentSetupMixin. Behavior-neutral; cli.py 14139 -> 13492 LOC. Methods moved (~647 LOC): _ensure_runtime_credentials, _resolve_turn_agent_config, _init_agent, _preload_resumed_session, _display_resumed_history. All self.* calls resolve unchanged via the MRO (HermesCLI(CLIAgentSetupMixin, CLICommandsMixin)). Import split (same recipe as #41942): 2 neutral deps (sys, _escape) imported at the mixin module top; 12 cli.py-internal helpers/constants (AIAgent, ChatConsole, CLI_CONFIG, _cprint, _DIM, _RST, _accent_hex, ...) imported lazily per-method (from cli import ...) so the mixin never imports cli at module scope -> no cycle. Repointed one source-inspection change-detector (test_callable_api_key.py) to read the mixin file where the method now lives.	2026-06-08 09:41:34 -07:00
teknium1	0904bc7ea2	refactor(cli): extract 32 slash-command handlers into CLICommandsMixin (god-file Phase 4) Lift the `_handle_*_command` cluster (2,077 LOC) out of HermesCLI into hermes_cli/cli_commands_mixin.py; HermesCLI now inherits CLICommandsMixin so every self.<handler> call resolves unchanged via the MRO. Behavior-neutral. Import discipline mirrors gateway/slash_commands.py (PR #41886): neutral deps imported at the mixin module top level; cli.py-internal helpers/constants (_cprint, _ACCENT, save_config_value, ...) imported lazily inside each handler via 'from cli import ...' so the mixin never imports cli at module scope. cli.py 16215 -> 14139 LOC. One test mock repointed (cli.is_browser_debug_ready -> hermes_cli.cli_commands_mixin.is_browser_debug_ready).	2026-06-08 02:13:07 -07:00
kshitijk4poor	8e71b5136b	fix(cli): paint approval/clarify/sudo/secret modal prompts directly, not via the throttle (#41098 ) In classic CLI mode the dangerous-command approval prompt (and the clarify, sudo, and secret-capture prompts) could fail to render: the user saw '⏱ Timeout — denying command' after 60s without ever seeing the panel, making approvals.mode: manual unusable. Root cause. These prompts run their wait loop on the agent/background thread: they set modal state that a ConditionalContainer's filter reads, then call self._invalidate() to repaint so the panel appears. _invalidate() is a THROTTLED wrapper built for high-frequency background repaints (spinner frames, streaming) — it (a) returns early while a SIGWINCH resize-recovery is pending, and (b) otherwise only repaints if 250ms elapsed since the last paint. Under either condition the modal's entry paint is silently dropped, the ConditionalContainer never re-evaluates, and the prompt times out unseen. The throttle never belonged on these paths. Originally the callbacks painted with a direct self._app.invalidate() and worked; a throttle PR blanket-replaced every invalidate (including these rare, one-shot, user-blocking modal paints) with the throttled _invalidate(); a later commit removed an idle 1Hz repaint that had been masking dropped modal paints, surfacing the bug. Notably the modal KEY-BINDING handlers (↑/↓/Enter) already paint with a direct event.app.invalidate(), never the throttle — the background-thread callbacks were the inconsistent ones. Fix. Add a small _paint_now() helper that paints directly (guarded for a missing _app, exception-safe) and route the four modal paths' entry, response, countdown, and teardown paints through it — matching the key-handler idiom. This covers approval, clarify, sudo, and the secret-capture teardown (_submit_secret_response, which previously used the throttled _invalidate() so its panel could linger after submit). _invalidate() is left untouched and its docstring now states it is for high-frequency background repaints only; modal/interactive paints must use _paint_now()/_app.invalidate() directly. This also fixes the resize-recovery edge case for free (a direct paint never consults the resize guard) without a throttle-bypass flag that could be cargo-culted onto hot paths. Countdown refresh cadence tightened 5s->1s so the timer stays visible while waiting, and a copy-pasted duplicate countdown block in _clarify_callback is removed. Tests: TestModalPaintNow drives all three wait-loop callbacks on a background thread with BOTH gates active (_resize_recovery_pending=True + a recent _last_invalidate in the throttle window) and asserts the panel paints on entry AND repaints on teardown; plus a secret-teardown test, a direct _paint_now-vs-_invalidate gate test, and a no-_app safety test. Each modal test fails if its paint is reverted to _invalidate(). 17 in-file tests pass; full tests/cli suite green (900). Diagnosis credit: the throttle-drop root cause was identified by @sanidhyasin in #41116; @islam666 independently reached the same direct-invalidate approach in #41166; original report #41098 by @jodonnel.	2026-06-08 00:46:43 +05:30
Teknium	136dae779e	fix(cli): return bool (not None) when a destructive-slash confirmation is cancelled (#40583 ) process_command() is typed -> bool, but the /clear, /new, and /undo cancel paths did a bare `return` (None) when _confirm_destructive_slash was declined, leaking None through the bool contract. Return True (command handled, keep the REPL alive) on cancel. Co-authored-by: yubingz <yubingz@users.noreply.github.com>	2026-06-07 02:49:28 -07:00
Siddharth Balyan	fcb1944b4f	feat(credits): usage-aware credits — in-session notices, /usage view, dev readout (#40011 ) Some checks are pending Deploy Site / deploy-vercel (push) Waiting to run Details Deploy Site / deploy-docs (push) Waiting to run Details Docker Build and Publish / build-amd64 (push) Waiting to run Details Docker Build and Publish / build-arm64 (push) Waiting to run Details Docker Build and Publish / merge (push) Blocked by required conditions Details Lint (ruff + ty) / ruff + ty diff (push) Waiting to run Details Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run Details Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run Details Nix Lockfile Fix / auto-fix-main (push) Waiting to run Details Nix Lockfile Fix / fix (push) Waiting to run Details Nix / nix (macos-latest) (push) Waiting to run Details Nix / nix (ubuntu-latest) (push) Waiting to run Details OSV-Scanner / Scan lockfiles (push) Waiting to run Details Tests / test (1) (push) Waiting to run Details Tests / test (2) (push) Waiting to run Details Tests / test (3) (push) Waiting to run Details Tests / test (4) (push) Waiting to run Details Tests / test (5) (push) Waiting to run Details Tests / test (6) (push) Waiting to run Details Tests / save-durations (push) Blocked by required conditions Details Tests / e2e (push) Waiting to run Details uv.lock check / uv lock --check (push) Waiting to run Details * feat(tui): HERMES_DEV_CREDITS live-spend dev readout (L0 tracer for usage-aware credits) L0 of the usage-aware-credits feature: a dev-only, env-gated tracer that exercises the real header -> CreditsState -> TUI pipe end-to-end behind HERMES_DEV_CREDITS, de-risking the L1/L5 build before the notice policy exists. - agent/credits_tracker.py: CreditsState + parse_credits_headers (headers are strings -> paid_access via == "true", never bool(); retain-last-known; only subscription_micros may be negative; _usd kept verbatim). - run_agent.py: _capture_credits / get_credits_state / get_credits_spent_micros, session-start baseline latch, + dev-gated "credits" capture log. - agent/chat_completion_helpers.py: capture on the streaming response. - agent/agent_init.py: init _credits_state + _credits_session_start_micros. - tui_gateway/server.py: _get_usage emits dev_credits_spent_micros only when flagged. - ui-tui appChrome.tsx / types.ts: cents delta status segment + "(dev credits)" banner. Off by default; silent for normal users. Validated live against staging (capture log delta matches the TUI segment). Throwaway consumer (readout/log/ banner); credits_tracker + the capture plumbing are the real feature foundation. test(credits): lock parser under 9-state matrix + harden validation (L2) Add tests/agent/test_credits_tracker.py with 92 tests covering the 9-state matrix (healthy, sub_90pct, grant_exhausted, purchased_only, tool_pool_free, depleted, debt, missing, no_org) plus validation edge cases: version strict==1 with warn-once latch for v>1, bool-string trap (paid_access/tool_pool_gated_off == "true"/"false", never bool()), half-pair subscription limit treated as both-absent while parse succeeds, USD regex ^-?\d+\.\d{2}$, non-int micros → None, negative non-subscription micros → None, as_of_ms junk → None, zero limit ZeroDivision guard. Harden agent/credits_tracker.py to match the spec: - Add tool_pool_micros/tool_pool_gated_off/from_header fields to CreditsState - Add depleted property (== not paid_access, never remaining==0) - Change used_fraction guard to key off subscription_limit_micros (the actual denominator) not denominator_kind (metadata) - Replace fail-soft _safe_int with a sentinel-returning variant; full validation now returns None on any malformed field rather than silently defaulting - Add module-level warn-once latch for version > 1 - Add USD regex validation; add denominator_kind allow-list check - Parse x-nous-tool-pool-* prefix headers (not x-nous-credits-tool-pool-) feat(credits): notice spine — AgentNotice + notice_callback/notice_clear_callback + TUI binding (L1) L1 of usage-aware credits: the driver-agnostic notice delivery spine that L4's policy will fire through and L5's TUI render will consume. - agent/credits_tracker.py: AgentNotice dataclass (text/level/kind/ttl_ms/key/id; kind defaults "sticky", kept TTL-expressive for a future config seam). - run_agent.py: AIAgent gains notice_callback + notice_clear_callback slots and _emit_notice / _emit_notice_clear emitters (swallow all callback errors — a notice must never break the agent loop; no-op when unbound). - agent/agent_init.py: thread both callbacks through init_agent. - tui_gateway/server.py: bind both in _agent_cbs → notification.show / notification.clear WS events (snake_case payload, matching the existing gateway-event convention). - ui-tui/src/gatewayTypes.ts: notification.show / notification.clear arms on GatewayEvent. - tests/run_agent/test_notice_spine.py: 15 tests (emitter fire + fail-open + no-op, signature threading, TUI binding payload shape). Messaging push is out of v1 (binds neither callback). CLI binding + the TUI render/ decode land with L4 (firing) and L5 (render) so turn-end flush is wired correctly. * feat(credits): threshold reconciliation policy + tests (L4.1) * feat(credits): wire threshold policy into capture + latch (L4.2) After a fresh header parse, _capture_credits runs evaluate_credits_notices against the agent's _credits_latch and emits the result — clears first, then shows (so a recovered depletion clears before the "restored" success lands, and depleted wins the latest-wins slot). Gated on a bound notice_callback: messaging (no callbacks) still caches state for /usage but runs no policy. Parse stays fail-open (miss → keep last-known); the eval/emit path warns on failure rather than swallowing, so a depletion-notice bug can't vanish silently. - run_agent.py: _capture_credits split into parse (swallow→miss) + policy (warn); latch lazy-guarded (object.__new__ safety). - agent/agent_init.py: init agent._credits_latch = {"active": set(), "seen_below_90": False}. * feat(tui): render credits notices in the status bar (L5, Strategy B) The TUI now renders the notification.show / notification.clear gateway events the agent emits — a level-colored notice overrides the status/verb slot when not busy. - Notice state machine on turnController (pendingNotice + dedicated noticeTimer + show/clear/applyNotice/flushPendingNotice/clearNoticeState). createGatewayEventHandler decodes the events and delegates. - Render priority busy > notice > status (appChrome StatusRule); notice text rendered verbatim (its glyph comes from the policy), shrinkable so it never clips model│ctx; dev-credits banner + Δ segment preserved. UiState.notice is snake_case (matches wire). - Busy-wins: a notice arriving mid-turn is held and flushed at the THREE turn-end sites (recordMessageComplete / interruptTurn / recordError) — never idle(), which reset() also calls (would leak across sessions); reset() clears instead. - Dedicated noticeTimer (never statusTimer); TTL starts on visibility with an id-guard; latest-wins cancels the prior timer; clear is key-matched (no-op on mismatch); a sticky survives a turn (flush no-ops with no pending); session reset clears (no cross-session leak). - 20 tests (handler/turnController logic incl. R3-C2 timer isolation + render priority). * feat(credits): cold-start seed for new Nous sessions (L3) A genuinely-new Nous session has no inference header yet, so seed credits state from the authoritative GET /api/oauth/account snapshot at session start (in the new-session branch of _restore_or_build_system_prompt — inline, since the on_session_start plugin hook gets no agent reference). The seed runs the shared notice policy, so a session that opens already depleted warns IMMEDIATELY rather than only after the first turn. - Maps the nested account fields (paid_service_access → paid_access; total_usable / subscription / purchased on paid_service_access_info; rollover on subscription), each None-guarded; float dollars → micros via round(d1e6), _usd left "" (render formats from micros — never synthesize a verbatim usd from a float). - Magnitudes-only: no monthlyCredits on the endpoint → subscription_limit_* unset → used_fraction None → no warn90 from the seed (% only once a header lands, per D-E). - Provider-guarded to Nous; fail-open (any error leaves _credits_state None, never blocks startup); paid_access unknown ⇒ True (never falsely depleted). - run_agent.py: extracted the warm-path policy/emit block into a shared _emit_credits_notices() so capture and the seed fire notices identically. * feat(credits): /usage Nous credits magnitudes view + recovery trigger (L6) Add Nous credit dollar magnitudes to /usage (subscription / top-up / total + rollover + renewal + portal CTA), magnitudes-only per v1 (no % until the account endpoint exposes a denominator). Reuses the existing account-usage render machinery via a new pure build_nous_credits_snapshot() that maps a NousPortalAccountInfo to an AccountUsageSnapshot; no nous branch is added to fetch_account_usage (keeps the per-provider boundary intact). CLI /usage also doubles as a depletion-recovery trigger: a force_fresh account fetch, kept in a SEPARATE local so it never clobbers the header-sourced agent._credits_state (which alone carries used_fraction). If paid access recovered while credits.depleted is latched and a notice consumer is bound, it reuses agent._emit_credits_notices() to clear it. Gateway /usage displays magnitudes only — messaging binds no notice consumer, so it performs no recovery emit. Fail-open throughout: any portal hiccup leaves /usage unaffected. * refactor(credits): dedupe HERMES_DEV_CREDITS flag parse via shared helpers The dev-flag truthy check was inlined in three places. Replace with the shared utils.is_truthy_value (run_agent.py, tui_gateway/server.py — also drops a redundant inline `import os`) and a hoisted DEV_CREDITS_MODE export in ui-tui/src/config/env.ts (consumed by appChrome, which also stops recomputing the env check on every render). Behaviour-preserving; identical truthy set. * fix(credits): cut dead /usage recovery trigger + bound portal fetches (L6 review) Adversarial review found the /usage depletion-recovery trigger dead AND broken: the CLI binds no notice_clear_callback, the TUI runs /usage in a separate slash-worker subprocess (its own agent/latch), and the no-clobber rule made it evaluate stale paid_access anyway. Recovery already happens on the next inference (warm path), so the trigger was redundant — remove it and stop the depleted notice over-promising. - cli.py: remove the dead recovery block; bound the /usage portal fetch with a 10s wall-clock timeout (ThreadPoolExecutor) like the per-provider fetch — urllib's per-socket timeout is not a wall-clock guarantee. - agent/credits_tracker.py: reword the depleted CTA to "run /usage for balance" (no false recovery promise; /usage shows fresh magnitudes, sticky clears next turn). - agent/conversation_loop.py: same wall-clock timeout on the cold-start seed fetch so a stalled portal can't hang session startup; tidy its time import. * chore(credits): dev notice-state fixtures (HERMES_DEV_CREDITS_FIXTURE) Throwaway dev scaffolding to exercise the notice pipeline without real spend or Redis seeding. Set HERMES_DEV_CREDITS_FIXTURE to a state name (healthy / sub_90pct / grant_exhausted / depleted / clear) or a file path whose contents name a state (re-read each turn → flip states live for recovery testing). _capture_credits injects the chosen CreditsState instead of parsing real headers and runs the shared notice policy. Deletable with the rest of the HERMES_DEV_CREDITS scaffolding. * feat(credits): /usage monthly-grant % gauge The portal /api/oauth/account subscription block now carries monthly_credits (the per-period grant allowance, the % denominator). The consumer parsed monthly_charge but dropped monthly_credits, so /usage stayed magnitudes-only. Capture monthly_credits into NousPortalSubscriptionInfo + _subscription_from_payload. build_nous_credits_snapshot emits a Subscription usage window (real % used, routed through the existing render machinery) when monthly_credits is a finite positive denominator and credits_remaining is finite and <= cap; otherwise it degrades to magnitudes-only (older portals, rollover-over-cap, or non-finite payloads). Guards (adversarial-review-driven): reject non-finite operands (json.loads parses bare NaN/Infinity by default → would render $nan + a false 100% used), reject bools, guard div-by-zero (cap>0), and suppress the gauge when remaining > cap (rollover spanning the period makes the cap a nonsensical denominator → the $X-of-$Y detail would read as a contradiction). Debt (remaining<0) clamps to 100%. Money rule preserved: the ratio + magnitudes are computed from numeric float account fields via display formatting, never by parsing a server _usd string (there are none on these dataclasses). 13 gauge tests added (tests/agent/test_nous_credits_gauge.py). fix(credits): show /usage Nous block whenever a Nous account is present /usage runs in a slash-worker subprocess whose resolved inference provider is often not "nous" even when the user has a Nous account, so gating the Nous credits block on (provider == "nous") hid it entirely — the account data was fully available but never rendered. Gate instead on "a Nous account is logged in": a cheap local auth-state lookup (get_provider_auth_state('nous') has an access_token) decides whether to attempt the portal fetch, regardless of which provider inference runs on. In the gateway the block is also lifted out of the 'if provider:' scope so a Nous-credentialled user with another (or no) resident inference provider still sees their balance. Fail-open and the per-fetch wall-clock timeout are preserved. * fix(credits): show /usage Nous block when there's no live agent (TUI slash-worker) In the TUI, /usage runs in a slash-worker subprocess that resumes the session WITHOUT building an agent (self.agent is None), so _show_usage early-returned "(._.) No active agent" before ever reaching the Nous credits block — which is agent-independent (a portal fetch gated on Nous auth-state). Extract the block into _print_nous_credits_block() and run it at the no-agent / no-calls early-returns too (returns True if it printed, so the fallback message only shows when there's genuinely nothing). Verified live against staging: the block + monthly-grant gauge now render in the slash-worker /usage path (previously hidden). The plain CLI REPL + messaging paths are unchanged (they have a live agent). * feat(credits): escalating 50/75/90 usage bands (single status line) Replace the lone 90%-used warning with three escalating bands (50 info, 75 warn, 90 warn) shown as ONE status-bar line: it displays the highest band the subscription grant has crossed, replaces the line as usage climbs, steps back down on recovery, and clears below 50%. No stacking, no per-turn churn. Bands live in a tunable CREDITS_USAGE_BANDS list; the policy derives everything from it. Single notice key (credits.usage) with a usage_band latch field so the notice only re-emits when the band actually changes. The crossing gate (seen_below_90) is preserved so a fresh live session that opens mid-range stays quiet until it has been observed below the lowest band (cold-start primes it when it wants an open-high warning). Denominator math unchanged: % = subscription grant burn (cap - grant_remaining)/cap, clamped [0,1]; top-up never moves the %. Migrated test_credits_policy.py to the new key + added TestUsageBands (climb, step-down, recovery-clear, idempotent, inclusive boundaries). * feat(credits): hydrate notices at session OPEN via shared seed (TUI + first-turn) Notices previously only fired inside a conversation turn (first message), so a session that opened already depleted / past a usage band showed nothing at 'ready'. Extract the cold-start seed into a shared seed_credits_at_session_start() and call it (a) in the TUI/desktop agent build right after the notice callback is wired (fires at 'ready', before any message) and (b) as the first-turn fallback in conversation_loop. Idempotent (skips once _credits_state exists) and fail-open. The seed now maps monthly_credits -> subscription_limit_micros + denominator_kind='subscription_cap', so used_fraction is computable at seed time and usage-band warnings (not just depletion) hydrate on open. Primes the crossing latch so a session opening already in a band warns immediately. Degrades to depletion-only when monthly_credits is absent (older portals). Adds test_credits_cold_start.py covering open-at-band, depletion, debt, no-cap degradation, and the shared seed (fires/idempotent/skips-non-nous). * feat(credits): /usage monthly-grant % gauge + fixture support + TUI surfacing agent/account_usage.py: build_nous_credits_snapshot emits a subscription %% gauge when the portal supplies a positive, finite monthly_credits denominator with remaining <= cap (guards reject NaN/Infinity and rollover-over-cap, which would render $nan or a contradictory $X-of-$Y); degrades to magnitudes-only otherwise. Adds shared nous_credits_lines() (auth-gated, wall-clock-bounded portal fetch) so the CLI and TUI /usage render the same block, and _snapshot_from_credits_state() so HERMES_DEV_CREDITS_FIXTURE drives /usage offline too. TUI: session.usage RPC carries credits_lines (agent-independent) and the /usage panel renders them regardless of API-call count or resume state — previously the TUI's separate /usage implementation only showed token counts. Money rule preserved: %% and magnitudes come from numeric float account fields via display formatting, never by parsing a server _usd string. feat(credits): CLI REPL inline notices (parity with TUI) The plain CLI agent bound no notice callbacks, so credit notices were TUI-only. Bind notice_callback/notice_clear_callback on the CLI AIAgent; _on_notice renders a single level-colored line above the prompt (error red / warn yellow / success green / info dim) via _cprint, and seed credits at session open so a depletion or usage-band warning shows before the first message — the same hydration the TUI got. _on_notice_clear is a no-op (the REPL prints lines, no persistent slot). * test(credits): add sub_50pct + sub_75pct dev fixtures for the new usage bands The fixture set jumped 10%% -> 90%%; add sub_50pct (uf 0.5 -> band 50 info) and sub_75pct (uf 0.75 -> band 75 warn) so the new escalating bands are exercisable via HERMES_DEV_CREDITS_FIXTURE across all three surfaces (notice, session-open seed, /usage gauge). * fix(credits): usage-band notice clears on next prompt (not sticky-forever) A 50/75/90 usage heads-up was sticky and camped the status bar indefinitely. Clear the visible credits.usage notice when a new turn starts (startMessage), so it shows until your next prompt then yields. The server latch is unchanged, so it won't re-nag at the same band — it only re-shows when the band actually changes (climb) or clears when usage drops below the lowest band. Depletion stays sticky. * refactor(credits): consolidate the /usage credits block behind nous_credits_lines() The CLI (_print_nous_credits_block) and the messaging gateway (_handle_usage_command) each re-implemented the auth-gate + portal fetch + render, and both bypassed the dev-fixture short-circuit that only the TUI honored — so /usage ignored HERMES_DEV_CREDITS_FIXTURE on the CLI and in chat. Route both through the shared agent.account_usage.nous_credits_lines() helper: one fetch/render path, one auth gate, and the fixture works on every surface (~60 fewer duplicated lines). The gateway usage test recorded only the last asyncio.to_thread call; /usage now dispatches both the account fetch and the credits fetch, so it records every call and matches the account fetch by its provider arg. * fix(credits): keep the /usage gauge type-safe and log its fail-open path _is_finite_num is now a TypeGuard[float], so the type checker narrows the gauge operands (monthly_credits / credits_remaining) and the magnitudes passed to _fmt_usd through it — no more None-operand warnings on the arithmetic. Add a debug breadcrumb on the nous_credits_lines portal-fetch fail-open so a dead /usage block is diagnosable in agent.log without a dev flag. * fix(credits): harden the header tracker — prod-leak gate, hot-path probe, fire-and-forget seed - Prod-leak guard: dev fixtures (HERMES_DEV_CREDITS_FIXTURE) now also require HERMES_DEV_CREDITS, so a stray fixture var can't surface fabricated balances on a real account. Matches the documented run workflow (both vars set together). - Hot-path probe: parse_credits_headers checks for the version sentinel header before allocating a lowercased copy of the response headers — skips that work on every non-Nous API call. Behaviour-identical and still case-insensitive. - Fire-and-forget seed: the real portal fetch in seed_credits_at_session_start now runs in a daemon thread, so a slow/unreachable portal never delays session "ready" (previously blocked up to 10s). The dev-fixture path stays synchronous; the thread re-checks idempotency before hydrating (a live header may land first). - Diagnostics: debug breadcrumbs on the parse and seed fail-open paths so a crashed parser / dead seed is distinguishable from a legitimate no-headers miss. Cold-start tests set HERMES_DEV_CREDITS alongside the fixture to match the gate. * test(tui): fix env-timing in the StatusRule dev-credits assertion DEV_CREDITS_MODE is read once at module load (config/env), so mutating process.env.HERMES_DEV_CREDITS inside the test couldn't flip it — the dev-banner assertion only passed if the env was exported before vitest started, and failed in a normal run. Move that assertion to a sibling file that mocks config/env with DEV_CREDITS_MODE: true (scoped, no module-reset / React-identity hazard). * test(credits): cover the dev-fixture /usage render and usage-band clear-on-prompt - _snapshot_from_credits_state (the offline /usage renderer) had no direct test: lock the gauge math, the verbatim _usd magnitudes, the depletion line and the fixture marker, plus the no-cap (no gauge) and None-state cases. - turnController.startMessage had no test for clearing the credits.usage notice on the next prompt while leaving credits.depleted sticky. feat(credits): deliver credit notices over messaging gateways Bind notice_callback/notice_clear_callback on the per-turn gateway agent so usage-band / depletion / restored notices reach Telegram/Discord/Slack/ etc. Previously the messaging gateway bound neither callback, so the agent's _emit_credits_notices early-returned and a chat user crossing a band got nothing unless they ran /usage manually. - render_notice_line(): AgentNotice -> single plaintext line (level glyph + text), plaintext-only so it renders uniformly without per-platform escaping. Fail-soft on malformed/empty notices. - Standalone push for every notice (messaging has no persistent status bar): route through the shared _deliver_platform_notice rail (honors private/ public delivery + thread metadata), scheduled onto the gateway loop via safe_schedule_threadsafe from the agent's sync worker thread — same pattern as _status_callback_sync. - The fired-once latch lives on the cached (reused-in-place) agent and persists across turns, so a band crosses once -> one push, no per-turn re-nag. Re-fires only after idle-eviction rebuilds the agent (a reminder). - Recovery ('Credit access restored') rides the show path (emitted as a success notice, not a clear). notice_clear_callback is a no-op: a sent platform message can't be cleanly retracted. Tests: render glyph/levels/fail-soft + public/private delivery seam through _deliver_platform_notice + no-adapter no-op. * fix(credits): don't double the glyph on messaging notices render_notice_line prepended a per-level glyph, but the notice policy already bakes the glyph into the text (and the TUI + CLI render it verbatim) — so every credit notice over messaging came out doubled ("⚠ ⚠ Credits 90% used", "⛔ ✕ Credit access paused"). Emit the text verbatim instead; drop the now-dead level→glyph map. The render tests fed glyph-less text (and the success case only checked startswith), so the doubling slipped through. Rework them around the verbatim contract and add an end-to-end regression that runs real evaluate_credits_notices output through render_notice_line and asserts the line is returned unchanged.	2026-06-06 13:18:18 +05:30
Brooklyn Nicholson	9c1bb8d2c7	Add /version slash command across CLI, gateway, TUI, and desktop. Surfaces Hermes Agent version info on demand without leaving chat; works mid-run like /help and /update.	2026-06-05 18:05:05 -07:00
kshitij	66a6b9c930	Merge pull request #39482 from liuhao1024/fix/rich-markup-error-on-session-resume fix(cli): use Rich [dim] tag instead of ANSI escape in session resume messages	2026-06-05 13:12:17 -07:00
ViewWay	1c909e75e1	fix(cli,gateway): complete max_tokens propagation — CLI path + env var override Previous commit only covered the gateway runtime path. This adds: - CLI __init__: read max_tokens from model config with HERMES_MAX_TOKENS env override - CLI AIAgent() calls (interactive + background): pass max_tokens - Gateway _resolve_runtime_agent_kwargs: add HERMES_MAX_TOKENS env override All three code paths (CLI, gateway runtime, session override) now consistently propagate max_tokens to AIAgent.	2026-06-05 09:10:26 -07:00
Teknium	9ca11b35d5	perf(/model): prewarm picker provider-models cache in background (#39847 ) * fix: respect disabled auto-compaction on context overflow Port from anomalyco/opencode#30749. When compression.enabled is false, NO automatic compaction trigger may fire. The proactive token-threshold paths (preflight + post-response should_compress gate) already honoured the setting, but the three provider-overflow recovery paths in the agent loop — long-context-tier 429, 413 payload-too-large, and context-overflow — called _compress_context() unconditionally, silently compressing and rotating the session against the user's explicit choice. Add a single guard at the top of the overflow-recovery dispatch: when compression is disabled and the error is one of those three overflow classes, surface a terminal error (compaction_disabled: True) telling the user to /compress manually, /new, switch to a larger-context model, or reduce attachments. Manual /compress (force=True) is unaffected — it never enters this loop. Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't compress when disabled; control case still compresses when enabled). Existing overflow-recovery tests updated to enable compaction explicitly (they verify the recovery fires); fixture defaults flipped to True to match production (compression.enabled defaults to True). * perf(/model): prewarm picker provider-models cache in background The no-args /model picker calls list_authenticated_providers(), which fetches each authenticated provider's live /v1/models list serially. On a cold or stale (>1h TTL) cache that blocks ~1.5s on the user's critical path the first time /model is opened in a session. Warm that exact path off-thread during the idle window right after the CLI banner is shown: a once-per-process daemon thread runs list_authenticated_providers() to populate provider_models_cache.json for every authed provider. By the time the user types /model, the picker hits the warm disk cache (~136ms vs ~1500ms). Process-level Event guard (mirrors run_agent's _openrouter_prewarm_done) ensures at most one thread per process; fully exception-isolated so an offline/no-creds provider can never affect the session.	2026-06-05 06:55:09 -07:00
YapBi	1bcfe9c58a	fix(cli): widen _run_cleanup MCP shutdown guard to BaseException	2026-06-04 19:40:46 -07:00
liuhao1024	391b594752	fix(cli): use Rich [dim] tag instead of ANSI escape in _restore_session_cwd Replace [{_DIM}] with [dim] in all _restore_session_cwd and _preload_resumed_session messages that go through _console_print (Rich Console.print). _DIM is an ANSI escape (\x1b[2;3m) that Rich cannot parse as a markup tag, causing MarkupError on session resume when the stored cwd is missing or inaccessible. Also uses [/dim] closing tag for explicit tag matching. Fixes #39469	2026-06-05 10:00:21 +08:00
liuhao1024	a3fb48b2ce	fix(state): keep /branch sessions visible after parent reopen /branch (aka /fork) sessions vanished from /resume and /sessions. Both surfaces funnel through list_sessions_rich(include_children=False), which hid any session with a parent_session_id unless identified as a branch via a heuristic — parent.end_reason == 'branched' AND child.started_at >= parent.ended_at. Two ways that heuristic failed: 1. CLI/gateway branches: once the parent was reopened (e.g. resumed) and re-ended with a different end_reason (tui_shutdown overwriting 'branched'), the heuristic stopped matching and the branch was hidden permanently. 2. TUI branches (tui_gateway session.branch): the TUI never ends the parent as 'branched' — it creates the child while the parent is still live — so the heuristic NEVER matched and TUI branches were hidden from the moment they were created (this is the macOS desktop app's primary symptom). Fix: persist a stable '_branched_from' marker in the branch session's model_config at creation time across ALL THREE branch paths (CLI cli.py, gateway gateway/run.py, and TUI tui_gateway/server.py), and OR a json_extract(model_config, '$._branched_from') IS NOT NULL check into the list_sessions_rich filter. The marker is immutable across the parent's lifecycle, so the branch stays visible regardless of how/whether the parent is ended. The legacy end_reason heuristic is kept (OR'd) so pre-existing branches remain visible. Subagent/compression children (no marker, parent not 'branched') stay correctly hidden. Fixes #20856. Approach by liuhao1024 (PR #20864); reimplemented on current main, extended to the TUI branch path (which the original missed), with regression tests for the reopen+re-end scenario and the TUI marker persistence.	2026-06-04 10:07:20 -07:00
Teknium	d3fab54933	fix(cli): clear screen on exit so live chrome isn't stranded in scrollback (#38928 ) The classic CLI left its live bottom chrome — the status bar, input box, and separator rules — frozen in terminal scrollback after exit, on every exit path (/exit, /quit, Ctrl+C, EOF) and on both Linux and Windows. The prior erase_when_done=True fix (`bf82a7f1c`) routes prompt_toolkit's teardown through renderer.erase(), but that walks back by the renderer's internal cursor model and does not reliably wipe the chrome in practice — users still saw a dead status bar + the rest of the session sitting above the resume summary. Clear the screen + scrollback directly at the single exit funnel instead. All exit paths converge on _print_exit_summary() (called from the run-loop finally block after app.run() returns and prompt_toolkit has restored terminal modes), so a new _clear_terminal_on_exit() helper runs there before the summary prints. It writes ESC[3J ESC[2J ESC[H (erase scrollback, erase screen, home cursor) on a real TTY, no-ops silently when stdout is not a terminal (pipes/redirects), and falls back to the platform clear command if the escape write fails. Works on Linux, macOS, and modern Windows terminals (Terminal/conhost with VT processing, already enabled by prompt_toolkit). The resume/goodbye summary now prints at a clean top-left with nothing stranded above it. Fixes #38252.	2026-06-04 04:38:35 -07:00
Teknium	bf82a7f1cc	fix(cli): erase live chrome on exit so it isn't stranded above the session summary Sets erase_when_done=True on the classic CLI's prompt_toolkit Application so the live bottom chrome (status bar, input box, separator rules) is wiped on exit instead of frozen into scrollback. Previously prompt_toolkit's render_as_done teardown repainted the chrome one final time and left it on screen (ESC[J only erases below the cursor, not the chrome above), so a dead status bar + empty prompt + rules were stranded between the conversation transcript and the 'Resume this session' summary, and stacked with the next session's UI on resume. erase_when_done routes teardown through renderer.erase() which wipes exactly the managed chrome region; the conversation transcript prints through patch_stdout into normal scrollback and is untouched. Applies to every exit path (/exit, /quit, EOF, Ctrl+C). Fixes #38252.	2026-06-04 03:03:23 -07:00
helix4u	ffb53767bf	fix(config): align prefill messages key handling	2026-06-03 23:51:44 -06:00

1 2 3 4 5 ...

833 commits