hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-29 11:42:04 +00:00

Author	SHA1	Message	Date
longer	6d9ca04574	fix(desktop): resume latest compression continuation	2026-06-25 16:29:09 -07:00
brooklyn!	ffa3d3c811	Merge pull request #49037 from NousResearch/bb/projects-paradigm feat(desktop): first-class projects — sidebar, coding rail, review pane, and agent project tools	2026-06-25 17:49:05 -05:00
Teknium	fd2a35b169	fix: stop reporting cache-hit rate and cost across all UI surfaces (#52717) * fix: stop reporting cache-hit rate and cost across all UI surfaces Cost estimates and cache read/write token reporting are unreliable on providers that don't surface cached_tokens (e.g. ollama-cloud, which doesn't implement prompt_tokens_details.cached_tokens), producing misleading near-zero 'cache hit' readouts and cost figures. Remove cost + cache-hit reporting from every user-facing surface; keep input/output/total token counts (provider-agnostic and accurate) and the Nous account billing UI (real account money, separate from per-conversation estimates). Surfaces: - CLI /usage + model-info: drop cost lines + cache read/write token lines - Gateway /usage + /model: drop cost + cache lines - tui_gateway/server.py: stop emitting cost_usd / cache_read in usage and subagent.complete payloads - TUI (Ink): drop cost from status bar (+ showCost plumbing), /usage panel, thinking rollup, agents overlay (incl. compare view); keep token counts - Desktop Command Center: drop cost stat, per-model cost, actual-cost hint Underlying estimate_usage_cost / format_cost / insights cost columns are left intact but no longer surfaced (display-only change, reversible). * test: update TUI + gateway + CLI tests for removed cost/cache-hit reporting - CLI /usage test asserts cost/cache lines are absent, tokens present - gateway /usage test drops cost + cache asserts; removes cost-included test - TUI subagentTree summary expectation drops the cost segment - useConfigSync + appChrome status-rule tests drop showCost prop/state	2026-06-25 15:21:22 -07:00
Brooklyn Nicholson	4e023f5bc9	feat(gateway): build authoritative project tree	2026-06-25 16:40:27 -05:00
Teknium	c6575df927	feat(moa): expose MoA presets as selectable virtual models (#46081) * feat(moa): expose MoA presets as selectable virtual models Reconstructed onto current main (PR #46081's base had diverged with no common ancestor, marking the PR dirty so CI never dispatched). MoA is now a virtual provider: each named preset is a selectable model under provider 'moa', and the preset's aggregator is the acting model that answers and calls tools. Reference models fan out in parallel via a bounded ThreadPoolExecutor (the same batch pattern delegate_task uses) — all references dispatched at once, collected when every one finishes, then handed to the aggregator. Output order is preserved, failures and the MoA-recursion guard stay isolated per reference. - Removed the old mixture_of_agents model tool and moa toolset. - Added moa as a virtual provider in the provider/model inventory. - /moa is shortcut behavior over model selection (default preset / named preset / one-shot prompt). - Dashboard + Desktop manage named presets; presets appear in model pickers. - Parallel reference fan-out in agent/moa_loop.py with regression test. * fix(moa): thread moa_config through _run_agent to _run_agent_inner The reconstructed gateway MoA wiring declared moa_config on _run_agent (the profile-scoping wrapper) and used it inside _run_agent_inner, but the wrapper never forwarded it — _run_agent_inner had no such parameter, so the runtime hit NameError: name 'moa_config' is not defined on the compression-failure session sync path. Add moa_config to _run_agent_inner's signature and forward it from both wrapper call sites (multiplex and non-multiplex). Caught by tests/gateway/test_compression_failure_session_sync.py on CI shard test(4). * fix(moa): classify moa as a virtual provider in the catalog The moa virtual provider has no PROVIDER_REGISTRY/ProviderProfile entry, so provider_catalog() fell through to the default auth_type="api_key" with no env vars — tripping two catalog invariants: - test_provider_catalog: api_key providers must expose a credential env var - test_provider_parity: every hermes-model provider must be desktop-configurable moa already declares auth_type="virtual" in HERMES_OVERLAYS; consult that overlay as an auth_type fallback so the catalog reports moa as virtual (no real credential, no network endpoint). Exempt virtual providers from the desktop parity union check the same way 'custom' is exempt — derived from the catalog, not a hardcoded slug, so future virtual providers are covered too.	2026-06-25 13:52:06 -07:00
Brooklyn Nicholson	1ca1f9f2c7	refactor(tui_gateway): DRY the deferred-session paths Collapse the duplicated cold-resume / lazy-watch / create scaffolding into shared helpers: _deferred_session_record (the live-session dict minus the agent), _lazy_resume_info (the not-yet-built session.info), _claim_or_reuse_live (lock + double-checked register-or-reuse), and _schedule_agent_build (the pre-warm timer). Net -12 lines, three copies of the ~30-key session dict and the lazy-info block down to one each. No behavior change.	2026-06-25 14:03:03 -05:00
Brooklyn Nicholson	3bf00e459a	perf(desktop): make deferred resume the default, not an opt-in flag Per review: gating the faster path behind a `defer_build` flag that the only caller always sends is pointless. Flip it — `session.resume` now defers the agent build by default for every caller (desktop + Ink TUI); a caller that needs the agent built synchronously passes `eager_build: true` (used by the build-race test). The desktop no longer sends a flag. While verifying the flip, fixed two real parity gaps the deferred path had vs the old eager (`_init_session`) path: - `_enable_gateway_prompts()` was never called on a deferred resume, so approvals/clarify wouldn't route through the gateway prompt callbacks. - `_start_agent_build` never wired `background_review_callback` / `memory_notifications`, so a deferred-built session's self-improvement "💾 …" summary leaked to stdout instead of rendering in-transcript. Wiring it there also fixes it for `session.create` sessions, which build through the same path. ACP is unaffected (it uses its own session_manager, not this RPC); the Ink TUI already consumes the same lazy `info` shape from session.create and upgrades on the later `session.info` event.	2026-06-25 14:03:03 -05:00
Brooklyn Nicholson	c4c590e4a1	perf(desktop): make session switching fast under load Switching sessions in the desktop app could freeze the whole UI for several seconds on heavy, tool-rich chats. Root causes and fixes: - Cold `session.resume` built the AIAgent (MCP discovery, prompt/skill build) before returning, and the desktop awaits that RPC before it paints — so the entire switch blocked on the build. Add an opt-in `defer_build` resume path (the contract `session.create` already uses): return the full display transcript immediately, register an upgradable live session, and pre-warm the agent on a short timer. The persisted runtime identity (model/provider/base_url/api_mode/reasoning/tier) is restored on the deferred build so it can't drop the provider. - Nothing bounded how many in-memory agents accumulate; a user who reconnects often piled up detached sessions for the full 6h TTL. Add a soft LRU cap (`max_live_sessions`, default 16) that evicts the least-recently-active DETACHED sessions (no live client) — never a running, awaiting-input, mid-build, or live-transport one. Reopening re-resumes from disk. - On the prefetch-hit cold-resume path, skip rebuilding a throwaway merged-message array (and its 1000-entry Map) when the prefetch already painted the exact transcript; the downstream sameMessageList guard already drops the publish, so it was pure main-thread cost. The desktop opts into `defer_build` for every non-watch cold resume; the eager path stays for CLI/TUI and existing callers.	2026-06-25 14:03:03 -05:00
kshitij	c210e23a02	Merge pull request #52386 from NousResearch/salvage/31999-yaml-indent fix(utils): unify YAML list indent across all config writers (#31999)	2026-06-25 23:39:37 +05:30
xxxigm	0aea0c3654	fix(utils): unify YAML list indent across all config writers (#31999) atomic_yaml_write used default yaml.dump which emits indentless sequences (list items at column 0), while atomic_roundtrip_yaml_update (ruamel.yaml) emits 2-space-indented sequences. Cross-path writes to the same config.yaml toggled indentation on every save, eventually producing a mixed-indent file that js-yaml rejects with 'bad indentation of a mapping entry', silently dropping custom_providers and breaking model switching. Add IndentDumper SafeDumper subclass that forces indentless=False, route atomic_yaml_write through it. Route tui_gateway._save_cfg and the Telegram adapter's config writer through atomic_yaml_write so all paths emit the same 2-indent layout. Salvaged from #32034 by @xxxigm. Adapted to current main which already has allow_unicode=True (from #51356) but was missing IndentDumper. Closes #31999	2026-06-25 23:27:44 +05:30
Brooklyn Nicholson	70319626a9	fix(tui_gateway): queue mid-turn prompts instead of dropping them on a busy retry A prompt sent while a turn was in flight got rejected with 4009 "session busy", which pushed clients (the desktop app) into a deadline-bounded busy-retry. When turn teardown outlived that deadline — e.g. the user hits stop while a slow, non-interruptible tool (web_search, read_file, an MCP call) is mid-flight, since the sequential executor only checks the interrupt flag between tools — the resubmitted message was silently dropped: "it just doesn't listen". Wire the previously-dead display.busy_input_mode config into prompt.submit: instead of rejecting, apply the policy and queue the message to run as the next turn (drained in run()'s tail, ahead of goal/notification follow-ups). Modes: interrupt (default) interrupts the live turn so it winds down promptly then runs the queued message; queue runs it after the current turn finishes; steer injects it into the live turn when accepted, else queues. The queued slot pins the sender's transport and losslessly merges a second arrival. No client deadline, no dropped sends.	2026-06-25 12:29:49 -05:00
Brooklyn Nicholson	7ef0f360d0	feat(gateway): expose coding verification status Add a read-only gateway RPC for querying the passive verification ledger without running checks from the UI surface.	2026-06-24 22:36:03 -05:00
Teknium	a4fa1481e2	fix(tui): route /learn through command.dispatch so the prompt fires (#52232) The Desktop GUI (tui_gateway) slash worker subprocess has no reader for the CLI's _pending_input queue. /learn's CLI handler prints the ack and puts the built prompt onto that queue, so in the TUI the prompt was silently dropped — ack shown, no LLM turn, no skill created (#51829). command.dispatch already handles 'learn' correctly (returns {type: send, message: build_learn_prompt(arg)}), but 'learn' was missing from _PENDING_INPUT_COMMANDS, so slash.exec fell through to the worker instead of routing to command.dispatch. Add it to the frozenset, matching the existing goal/queue/steer/plan pattern.	2026-06-24 18:48:50 -07:00
Brooklyn Nicholson	1fe013ee16	feat(pets): polish generate flow and reduce hatch CPU pressure Ship the final pet-generation UX polish (provider picker behavior, step-2 cancel flow, banner integration, and visual consistency) and make saturated-chroma background removal C-op driven so hatch processing no longer hammers the machine during long runs.	2026-06-24 19:08:06 -05:00
Brooklyn Nicholson	b674f7ba28	feat(pets): offer backend setup when generation is unavailable When no reference-capable image backend is configured, generating a pet is impossible — so instead of a dead prompt + post-hoc error, the overlay now detects it up front and offers a way out: - pet.generate.status RPC reports whether a reference-capable provider (OpenRouter / Nous Portal / OpenAI) is set up; the overlay probes it on open and swaps the prompt for a friendly setup card (paw, one-line copy, "Set up image generation" → /settings?tab=providers, key links). - useRouteOverlayActive(): reusable hook so any portaled modal yields the screen to a full-screen route overlay (e.g. settings) and reappears — re-running its mount effects — on return, instead of closing. The probe re-runs on that remount, so adding a key flips the card to the prompt.	2026-06-24 14:10:19 -05:00
Brooklyn Nicholson	aab49f6927	feat(pets): generation RPCs, non-blocking gallery + gateway plumbing - pet.generate / pet.hatch (parallel rows, off the reader thread) + cooperative pet.cancel; pet.export / pet.rename. - pet.gallery localOnly fast path + background manifest prefetch so the picker never blocks on petdex; rename follows the active-pet config. - gateway request gains optional timeout + AbortSignal for real Stop.	2026-06-24 13:48:38 -05:00
brooklyn!	35e9c63d89	Merge pull request #52008 from infinitycrew39/fix/desktop-nous-onboarding-stale-provider fix(desktop): stop Nous Portal onboarding from validating stale Anthropic config	2026-06-24 13:12:44 -05:00
infinitycrew39	6da615c77c	fix(desktop): scope onboarding runtime check to connected provider Let setup.runtime_check accept an optional provider, persist the selected provider/model before the gate, and validate the provider the user just connected instead of a stale config entry such as anthropic.	2026-06-24 23:19:45 +07:00
kyssta-exe	b85c460540	fix(tui): targeted save_config_value for model persistence (#48305) The TUI model-switch persistence (_persist_model_switch) rewrote the entire model config block via save_config(), destroying sibling keys the user set under model: (model_slots, model_fallback, base_url, ...) on every switch. Use targeted, atomic, comment-preserving save_config_value("model.default" / "model.provider" / "model.base_url") writes instead, so a model switch only touches the keys it changes. Salvaged from #48391 by kyssta-exe (authorship preserved). Fixes #48305	2026-06-24 19:34:33 +05:30
Teknium	d539cd9004	fix(config): write config.yaml as UTF-8 to stop emoji/personality corruption (#51676) atomic_yaml_write (and two sibling config writers) called yaml.dump without allow_unicode=True. The default personalities shipped in cli.py contain emoji/kaomoji, so PyYAML escaped astral-plane chars as 8-digit \UXXXXXXXX sequences inside multi-line double-quoted strings wrapped with \ line-continuations. Stricter/non-PyYAML parsers, editors, and hand-edits break that structure into unclosed quotes, failing the whole config parse -> silent fallback to defaults -> custom_providers lost. Add allow_unicode=True to the canonical writer plus tui_gateway/server.py and the telegram adapter's atomic config write so config is written as readable UTF-8 with no escape/fold artifacts. Fixes #51356	2026-06-23 23:28:21 -07:00
Brooklyn Nicholson	e495b33bf1	Merge remote-tracking branch 'origin/main' into bb/pets-merge # Conflicts: # hermes_cli/commands.py # tui_gateway/server.py	2026-06-23 19:05:22 -05:00
Teknium	e32ebc6aa2	feat(skills): /learn — distill a reusable skill from anything you describe (#51506) Open-ended skill learning across every surface. /learn <free text> takes a description of any source — a directory, a URL, the workflow you just walked the agent through, or pasted notes — and the live agent gathers it with the tools it already has (read_file/search_files, web_extract, the conversation, the pasted text), then authors a SKILL.md via skill_manage following the house authoring standards (<=60-char description, the standard section order, Hermes-tool framing, no invented commands). No engine, no model-tool footprint, works on any terminal backend (local, Docker, remote): /learn builds a standards-guided prompt and hands it to the agent as a normal turn. - agent/learn_prompt.py: shared standards-guided prompt builder - /learn registry entry (both surfaces) + CLI handler (inject onto input queue) + gateway handler (rewrite turn, fall through, /blueprint pattern) - tui_gateway command.dispatch returns a send directive -> TUI + dashboard chat - dashboard Skills page 'Learn a skill' panel (dir + URL + open-ended text) composes a /learn request and runs it in chat - docs (slash-commands ref + skills feature page), 11 targeted tests Inspired by OpenAI Codex's Record & Replay and the /learn concept from #47234 (dir-distillation engine); reworked to be open-ended and engine-free per review.	2026-06-23 13:51:28 -07:00
konsisumer	02050859f3	fix(tui): preserve live session identity across compression (#49041) When a session rotates id on compression, _sync_session_key_after_compress() re-anchored the session_key, approval-notify routing, yolo state, and slash worker — but never moved the active-session lease, which stayed keyed to the pre-compression id. And _find_live_session_by_key() matched live sessions on the stale session_key, not the live agent's current agent.session_id. After compression a resume/create path failed to recognize the existing live agent and could build a SECOND live agent against the same DB continuation -> forked lineage / cross-session message mixing. - active_sessions.transfer_active_session(): move a lease in place to the new id under the exclusive file lock (no slot drop). - gateway _transfer_active_session_slot(): call it inside _sync_session_key_after_compress(); on the rare fallback (entry pruned) RESERVE the new slot before releasing the old lease (reserve-before-release), so a concurrent gateway at the session cap cannot grab the freed slot in a release-then-reacquire window and leave this session with no lease; if the reserve fails, keep the existing lease (review fix). - _session_lookup_key(): make live-session lookup authoritative on agent.session_id, wired into all stale-session_key consumers (_find_live_session_by_key, _session_live_item, _live_session_payload) — fixes the whole lookup class. Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-06-24 00:54:18 +05:30
Teknium	72bfc48e63	feat(tui): track background subagents in the status bar (#51485) Parity with the classic CLI status bar's ⛓ indicator (PR #51441). The Ink TUI status bar now shows ⛓ N for live background/async subagents (delegate_task batches + background single delegations). - tui_gateway/server.py: _get_usage() embeds active_subagents from tools.async_delegation.active_count() — the same registry the CLI reads — onto the existing per-update usage payload, guarded so a raising active_count() leaves the field off without breaking usage. - ui-tui appChrome: new 'subagents' status segment (breakpoint w>=92, slots between bg and cost in the shed-order), renders ⛓ N from usage.active_subagents. - Usage / SessionUsageResponse types gain active_subagents?. Distinct from the turn-scoped SpawnHud / /agents overlay, which mirror live in-turn subagent.* events; this is the persistent registry count.	2026-06-23 11:32:00 -07:00
brooklyn!	211ba9c7d3	feat(agent): one-shot LLM helper + llm.oneshot gateway RPC (#51261) A "one-shot" is a single stateless model call that runs OUTSIDE any conversation: it never touches session history, never breaks prompt caching, and returns plain text. UI surfaces need this for small generative chores — a commit message from a diff, a rename suggestion, a summary — where an agent turn would pollute the thread and hand-rolling an LLM call at every call site would be worse. - `agent/oneshot.py`: `run_oneshot(...)` over the existing auxiliary-client plumbing (same path as title generation). Two call shapes: explicit instructions/input, or a registered `template` + `variables` (templates own the prompt engineering so it stays consistent across CLI/TUI/desktop). Ships a `commit_message` template. Model selection inherits the live session via `main_runtime`, else the configured aux `task` backend. - `tui_gateway/server.py`: `llm.oneshot` RPC (long-handler) inheriting the session's model when `session_id` resolves. Stateless by construction — no session mutation, cache untouched.	2026-06-23 08:01:50 +00:00
brooklyn!	af7b7f6322	feat(agent): expose coding-context project facts as structured data + project.facts RPC (#51259) Follow-up to the coding-context posture (#43316): that PR detects each repo's verify loop (manifests, package manager, exact test/lint/build commands, context files) and bakes it into the system-prompt snapshot — but only as a string, for the model. Non-prompt consumers (the desktop verify UI) had no way to read it without re-sniffing and drifting from the prompt. Split detection from rendering, keeping one source of truth: - `detect_project_facts(root) -> ProjectFacts` (frozen) holds the structured facts; `_project_facts()` now renders it into the same snapshot lines, so the prompt block stays byte-identical (cache-safe). - `project_facts_for(cwd)` resolves the workspace root (git, else marker) and returns the structured facts, or None outside a workspace. - `project.facts` gateway RPC surfaces it to any client (desktop/TUI/ACP). Tests assert the structured output and that the UI-facing commands never drift from what the prompt block renders (one detector feeds both).	2026-06-23 08:00:01 +00:00
kshitijk4poor	c080b2dc3e	fix(gateway): redact credentials from TUI approval prompts (#48456) Follow-up to #50767, which redacted the chat-platform (_approval_notify_sync) and SSE/API (_approval_notify) approval transports. The TUI JSON-RPC transport is the third egress and was missed: three register_gateway_notify callbacks in tui_gateway/server.py emitted the raw approval_data — including the unredacted command Tirith flagged — straight to the TUI client via _emit. Route all three registrations through a new module-level _emit_approval_request() helper that redacts payload['command'] via the shared gateway.run._redact_approval_command seam before emitting, matching the pattern used for the other two transports. Completes the whole-bug-class fix for #48456. Tests: assert the helper emits a redacted command (real credential pattern), handles missing/None command, and a wiring guard that no registration emits the raw payload directly (only the helper may). Both mutation-checked. The #48456 fix series originated from @liuhao1024's #48462 — credit to them for the original report and chat-platform fix; this completes the remaining transport. Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>	2026-06-23 03:14:18 +05:30
Teknium	ff85af3fc7	feat(goals): /goal wait <pid> — park the loop on a background process (#50503) * feat(goals): add /goal wait <pid> barrier to park the loop on a background process The /goal loop re-pokes the agent every turn via the post-turn judge. When a goal is gated on a long-running background process (CI poller, build, test matrix, deploy) that produces nothing to judge yet, this spins the agent into 'is it done?' busy-work and burns the turn budget. /goal wait <pid> [reason] parks the loop: while the PID is alive, the judge is skipped, no turn is consumed, no continuation fires, and /goal status shows a parked indicator. The barrier auto-clears the moment the process exits (the agent's notify_on_complete watcher is the natural wake signal), then the next turn resumes normal judging. /goal unwait clears it manually; pause/resume/clear drop it; a dead/stale PID can never wedge the loop. Wired across CLI, gateway, and the mid-run command guard for parity. Barrier persists in SessionDB.state_meta (survives /resume); GoalState gains backward-compatible waiting_on_pid/waiting_reason/waiting_since fields. 12 new tests; docs updated. * fix(goals): use gateway.status._pid_exists for liveness, not os.kill(pid,0) The Windows-footguns CI guard flagged os.kill(pid, 0) in _pid_alive — on Windows that's not a no-op, it routes to CTRL_C_EVENT and hard-kills the target's console process group (bpo-14484). Delegate to the canonical footgun-safe gateway.status._pid_exists (psutil + ctypes/POSIX fallback) instead, with a direct-psutil last resort. * feat(goals): judge-driven auto-wait — the loop parks itself, no manual /goal wait Makes the wait barrier automatic. Every turn the judge is shown the agent's live background processes (pid, command, uptime, output tail from the process_registry) alongside the goal + response, and can return a new 'wait' verdict instead of continue: {"verdict":"wait","wait_on_pid":N} → park until that process exits; {"verdict":"wait","wait_for_seconds":N} → park until the deadline passes. evaluate_after_turn acts on the directive (sets the barrier, parks the loop) so the agent isn't re-poked into busy-work while CI/builds/deploys run. Adds a time-based waiting_until barrier alongside the pid barrier; both auto-clear and can never wedge the loop. Drivers (CLI, gateway, tui_gateway) feed the live registry in via gather_background_processes(). Manual /goal wait stays as an override. Judge verdict contract widened to (verdict, reason, parse_failed, wait_directive); legacy {"done":bool} shape still accepted. * test(goals): update kanban _fake_judge to the 4-tuple judge contract CI test(3) caught it: test_kanban_goal_mode's _fake_judge still returned the 3-tuple (verdict, reason, parse_failed), but the kanban loop now unpacks the 4-tuple (+ wait_directive). Update the fake to return None for the directive and accept the background_processes kwarg. * feat(goals): trigger-based wait — park on a process's own signal, not just exit Addresses two gaps in the judge-driven wait: (1) the judge could only express 'wait until PID exits' or 'wait N seconds', so a long-lived watcher/server that fires a trigger MID-RUN (and may never exit) couldn't be waited on; (2) the process's own watch_patterns/notify_on_complete trigger was invisible to the judge. Adds a session-based barrier (waiting_on_session) that releases on the process's OWN trigger via process_registry.is_session_waiting(): the session exits, OR (if started with watch_patterns) its pattern matches — even while the process keeps running. list_sessions() now surfaces session_id + watch_patterns/watch_hit/notify_on_complete so the judge sees the trigger and is told to prefer wait_on_session for trigger processes. Judge verdict gains a {wait_on_session} directive (preferred over pid). Backward-compatible GoalState field; pid + time barriers unchanged. Tests: TestSessionTriggerBarrier (release on mid-run pattern match while alive, release on exit, unknown-session, full park→trigger→resume, parse, validation, backcompat load). 105 goal-surface + 85 process_registry tests green.	2026-06-22 06:27:29 -07:00
Brooklyn Nicholson	5342eccf12	Merge remote-tracking branch 'origin/main' into bb/pets	2026-06-22 05:25:49 -05:00
Shannon Sands	b9b4756ab4	fix dashboard chat session titles	2026-06-21 22:44:02 -07:00
Teknium	95d53c3bcb	feat(cli): /reasoning full — show complete thinking, not 10-line clamp (#50499) * feat(cli): /reasoning full to show complete thinking, not 10-line clamp The post-response Reasoning recap box hard-clamped long thinking to the first 10 lines, so there was no way to see the full reasoning trace after a turn (live streaming already shows it in full). Add display.reasoning_full (default off) plus /reasoning full|clamp to toggle it at runtime; the clamp truncation note now points at the command. Addresses repeated user requests to show all thinking tokens. * test(gateway): de-snapshot /reasoning help assertion The test froze the exact args-hint literal '/reasoning [level|show|hide]', which the new full/clamp args change to '[level|show|hide|full|clamp]'. Convert to an invariant: assert /reasoning is in help and carries its core args, not the exact hint string. * feat(tui): /reasoning full|clamp parity in tui_gateway The classic-CLI reasoning_full toggle had no TUI equivalent — typing /reasoning full in the TUI fell through to parse_reasoning_effort and errored. The TUI renders thinking as an expand/collapse section (no fixed 10-line recap), so map full -> sections.thinking=expanded (raw, uncapped via thinkingPreview mode='full') and clamp -> collapsed, persisting display.reasoning_full for cross-surface config consistency.	2026-06-21 20:21:11 -07:00
Teknium	99f3072aa0	fix(model-switch): a failed in-place swap must be a no-op, not a dead session (#50375) When a /model switch resolves a valid model but the in-place agent swap fails mid-conversation (expired key, unreachable base_url), the agent rolls itself back to the old working model+client and re-raises. The callers caught that re-raise, logged a warning, then committed the broken switch anyway: wrote the failed model to the session DB, set _session_model_overrides to the broken model/provider/key, and (gateway direct path) evicted the working cached agent. The next message then rebuilt a dead agent from the broken override -> permanently unusable conversation (#50163). Fix the whole caller class so a failed swap aborts the commit entirely: - gateway/slash_commands.py (picker + direct /model paths): on swap failure, early-return an error message; skip DB persist, session override, cache eviction, and config write. - cli.py (both /model handlers): snapshot CLI-level credential/runtime fields before mutating, restore them on swap failure, and abort the note + success print. - tui_gateway/server.py: wrap the previously-unguarded swap; on failure raise a clean error and skip worker restart, runtime persist, switch marker, session model_override, and config persist. The no-cached-agent path (apply-on-next-session) is unaffected. Adds a gateway regression test that fails on the pre-fix behavior.	2026-06-21 13:33:23 -07:00
teknium1	d0de4601d2	fix(tui): /compress shows a before/after summary (#46686) The TUI /compress slash side-effect compressed the session, synced the key, and emitted session.info — but returned an empty string, so the user saw no 'Compressed: N → M messages / ~X → ~Y tokens' feedback. The CLI (_manual_compress) and gateway (slash_commands) paths both already call summarize_manual_compression; the TUI slash path was the lone gap. Snapshot history + rough token estimate before and after compaction and return the formatted summarize_manual_compression() feedback, mirroring the session.compress RPC handler. The estimate uses the same estimate_request_tokens_rough(system_prompt, tools) inputs as the RPC path, re-reading the system prompt after compaction (it may be rebuilt). Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>	2026-06-21 11:36:09 -07:00
bogerman1	c7e8854cb3	fix(tui): persist session messages on force-quit / signal shutdown Mirror the CLI's exit-path behaviour in the TUI gateway so that unpersisted conversation messages are flushed to state.db and the on_session_end plugin hook fires before the session is closed. Root cause: _finalize_session() only called db.end_session() to mark the session row as ended, but did NOT flush in-memory messages via _persist_session() or fire the on_session_end hook. When the user force-quit (double Ctrl-C, terminal-close, SIGHUP) while the agent was mid-turn, messages accumulated since the last persist point were silently lost. Changes: tui_gateway/server.py - _finalize_session(): - Persist unflushed messages via agent._persist_session() before db.end_session(). Prefers agent._session_messages (set by the last _persist_session call inside run_conversation) over session['history'] (stale when agent is mid-turn). - Fire on_session_end(interrupted=True) plugin hook so crash-recovery plugins can flush buffers, matching cli.py behaviour. tui_gateway/entry.py - _log_signal(): - Explicitly call _shutdown_sessions() before sys.exit(0) in the SIGHUP/SIGTERM handler as belt-and-suspenders over atexit. tests/tui_gateway/test_finalize_session_persist.py (new): - 11 tests covering: history persistence, _session_messages priority, empty-history skip, missing-agent, double-finalize, persist-exception resilience, hook firing, hook-exception resilience, and db.end_session preservation. Related: Closes the TUI half of #5021 (CLI already handles this via its atexit handler). Also addresses the session-persistence gap discussed in #18465 and #18269.	2026-06-21 07:26:07 -07:00
kshitijk4poor	1ca29723f0	fix(cli): log instead of swallow preflight-warning errors; consistent TUI warning field Follow-up to the salvaged preflight-compression warning: - Replace silent `except Exception: pass` at all 5 guard call sites (cli.py x2, gateway/slash_commands.py x2, tui_gateway/server.py) with `logger.debug(...)` so signature drift in the guard helper isn't hidden. - tui_gateway/server.py: set the confirm dict's `warning` field to the merged message (was bare expensive-model text) so it matches `confirm_message` for any future consumer reading `warning`. - Add trailing newlines to the two new files.	2026-06-21 16:31:56 +05:30
Tuna Dev	04730f32e7	fix(cli): warn when in-session model switch will preflight-compress Adds hermes_cli/context_switch_guard.py mirroring the model_cost_guard pattern. When a user switches models mid-session (Herm TUI picker, CLI, or /model on Telegram/Discord), the warning surfaces on the existing ModelSwitchResult.warning_message path used by the expensive-model guard if the new model's compression threshold is below the current session size. Partial fix for #23767 — addresses only the 'user-facing guardrail when switching from a high-context provider to a substantially lower-context provider' slice. The other proposed fixes from that issue (hard preflight token guard, metadata cache invalidation on switch, compression safety invariant, oversized tool-output handling) are out of scope for this PR.	2026-06-21 16:29:31 +05:30
teknium1	98ecd0beeb	docs(mcp): fix stale ~0.75s discovery-wait reference in late-refresh docstring The MCP discovery wait is now bounded by the config-driven mcp_discovery_timeout (default 1.5s), not the old 0.75s flat value. Updates the _schedule_mcp_late_refresh docstring that still cited ~0.75s after #49208 made the bound configurable.	2026-06-20 23:23:47 -07:00
Brooklyn Nicholson	75b36a138f	feat(pets): TUI pet pane, picker + gateway RPCs Add the Ink pet sprite pane, the interactive /pet picker overlay, and live pet switching/rescale driven by new tui_gateway RPCs (pet state, pet.scale, per-state frames). Wires pet flash state and the picker into the TUI layout and slash handler. Covered by the slash-handler test.	2026-06-20 14:18:36 -05:00
Gille	a7983d5ad7	fix(dashboard): hide sidecar sessions from history (#49269) * fix(dashboard): hide sidecar sessions from history * test(dashboard): allow sidecar source in session payload	2026-06-19 18:06:38 -04:00
alt-glitch	b6e2a54a94	fix(mcp): address adversarial review round 1 (cache parity, gates, races) Consolidated findings from three independent reviewers (Codex, Claude Code, a Hermes subagent w/ the hermes-agent-dev skill): - BLOCKING: refresh_agent_mcp_tools rebuilt only the registry subset, silently dropping post-build-injected memory-provider (mem0/honcho/…) and context- engine (lcm_) tools on every refresh. Now additive-preserving: re-applies the same injectors agent_init uses, staged on locals and published atomically. - Re-injection now honors the #5544 enabled_toolsets gate for context-engine tools, so a restricted-toolset platform can't get lcm_ leaked back in. - Atomic read-diff-publish under one lock: the returned `added` set and the (tools, valid_tool_names) pair are consistent even under concurrent callers (no half-swap, no TOCTOU). - background_review fork opts out (_skip_mcp_refresh) so its byte-identical tools[] cache parity with the parent is preserved. - CLI /reload-mcp routed through the shared helper (was a 4th divergent copy with the same clobber bug + missing disabled_toolsets). - Explicit reloads (TUI RPC + CLI) pass enabled_override so a server the user just enabled in config this session is picked up; automatic paths reuse the agent's build-time selection. - mcp_discovery_timeout default 5.0 -> 1.5s: correctness now comes from the between-turns refresh, so the startup wait is only a small turn-1 UX bump rather than a heavy dead-server latency penalty. - has_registered_mcp_tools checks registered TOOLS (not connected servers) so a zero-tool/prompt-only server doesn't make the per-turn hook fire forever. - Tests: rewrote the thread-safety test to actually exercise the write path (alternating tool sets), added the #5544-gate regression, the memory/context preservation regression, and a "callable next turn via valid_tool_names" contract; removed a dead monkeypatch line.	2026-06-19 11:57:43 -07:00
alt-glitch	93d6e73028	fix(mcp): expose late-connecting MCP tools to the agent (TUI/CLI/gateway) MCP servers that connect after the agent's one-time tool snapshot were invisible for the whole session. Two root causes, fixed together: 1. The startup discovery wait was a flat 0.75s. HTTP/OAuth servers commonly take 2-6s on a cold connect, so they missed the window and their tools never entered the agent's snapshot. `thread.join(timeout)` already returns the instant discovery completes, so raising the bound costs ~0s for the common case (no MCP / fast servers) and only ever blocks for a genuinely-pending server, capped so a dead server can't freeze startup. The bound is now configurable via `mcp_discovery_timeout` (config.yaml, default 5.0s). 2. Three call sites duplicated the agent tool-snapshot rebuild (the TUI `reload.mcp` RPC, the gateway reload, and the TUI late-binding refresh thread), and the late-refresh detected changes by tool COUNT — missing an equal-size add/remove swap. Consolidated into one shared `tools.mcp_tool.refresh_agent_mcp_tools(agent)` helper that diffs by tool NAME, mutates the agent under a lock (thread-safe), and respects the agent's own enabled/disabled toolsets. The late-binding refresh keeps its pre-first-turn cache-safety guard: it never rebuilds the tool list once a turn has started, so the cached prompt prefix is never invalidated mid-conversation. Tests: new tests/tools/test_refresh_agent_mcp_tools.py covers the name-based diff, in-place mutation, agent-scoped filtering, thread safety, and the config-driven discovery bound (incl. instant-return when nothing is pending). 75 passed across the touched areas.	2026-06-19 11:57:43 -07:00
Ben	b0e47a98f9	fix(managed-scope): honor managed scope in all standalone config loaders The skin bug was one instance of a class: several subsystems build their config dict directly from config.yaml instead of routing through hermes_cli.config.load_config (which carries the managed merge), so they silently ignored administrator-pinned values. Audited every config.yaml reader and fixed the behavioral-read bypasses: - gateway/config.py load_gateway_config (messaging gateway: session_reset, quick_commands, stt, model, ...) - gateway/run.py _load_gateway_config (its read_raw_config fast path also skipped the merge — read_raw_config returns raw user YAML) - tui_gateway/server.py _load_cfg (new TUI + desktop backend: skin, reasoning_effort, service_tier, provider_routing) - cron/scheduler.py (scheduled-job model/reasoning/toolsets/provider_routing) - hermes_logging.py (logging.level/max_size_mb/backup_count) - hermes_time.py (timezone) - hermes_cli/doctor.py (memory-provider diagnostic reads effective config) All route through a new shared managed_scope.apply_managed_overlay() helper that mirrors _load_config_impl (env-only expansion so a user ${VAR} can't shadow a managed literal, root-model-string normalization, leaf-merge) and is fail-open. cli.py's earlier inline fix is refactored onto the same helper. Write-back paths (slash_commands, telegram/yuanbao dm_topics, profile distribution) are deliberately left reading raw user YAML — overlaying managed values there would persist them into the user file. The dashboard (web_server.py) already routes through load_config and needed no change. TUI loader caches the RAW config so _save_cfg never writes managed values to disk. Adds test_managed_scope_overlay.py (helper) and test_managed_scope_loaders.py (per-surface integration); mutation-checked.	2026-06-19 07:46:33 -07:00
Alex Yates	fad4b40d9d	fix(model): persist /model switch by default across sessions A plain /model <name> switch only lasted for the current session — every new session reverted to the previously-configured model, so users had to re-switch every time (e.g. glm-5.1 -> glm-5.2 on every launch). Persist-by-default is now the behavior across all three /model surfaces (CLI, gateway, TUI/dashboard), gated by a new config key model.persist_switch_by_default (default true): /model <name> switch model (persists to config.yaml) /model <name> --session switch for this session only /model <name> --global switch and persist (explicit, unchanged) The effective persistence is resolved once via resolve_persist_behavior() in hermes_cli/model_switch.py so --session opts out, --global opts in, and the config-gated default applies otherwise. --global remains a valid explicit no-op alias for the new default.	2026-06-19 07:07:06 -07:00
kyssta-exe	1699525638	fix(tui): route pending-input commands via command.dispatch (#48848) When /goal (and other _PENDING_INPUT_COMMANDS: retry, queue, q, steer, plan, undo) were typed in the TUI desktop app, slash.exec returned error 4018 instructing the frontend to fall back to command.dispatch. Some clients failed that client-side fallback, leaving the command empty and surfacing "empty command" — the user's typed text was silently dropped. slash.exec now routes pending-input commands to command.dispatch internally, eliminating the fragile client-side fallback hop. The response is exactly what command.dispatch would have produced, so the TUI client behaves identically once the round-trip succeeds. Salvaged from #48944 — rebased onto current main. The original PR's source change and test_goal_command.py update are correct, but it missed the second test surface: tests/tui_gateway/test_protocol.py's parametrized test_slash_exec_rejects_pending_input_commands still asserted the old 4018 rejection for retry/queue/q/steer/plan, turning CI red (5 failures). That test is rewritten here as a behavior contract: slash.exec for a pending-input command must yield the same payload as a direct command.dispatch call, and must no longer emit the old "pending-input command" fallback rejection. Co-authored-by: kyssta-exe <kyssta-exe@users.noreply.github.com>	2026-06-19 14:53:33 +05:30
Teknium	620fd59b8e	feat(model-picker): add Refresh Models control to bust stale model cache (#48691) The desktop model picker had no way to force a fresh model fetch: model.options went through the 1h-cached provider_models_cache.json, and there was no flag to bust it. When a provider's cached list expired and its next live fetch failed, the picker fell back to the curated static list — silently dropping live-only models (e.g. OpenCode Zen's free tier like deepseek-v4-flash-free) the user had been using. - Thread refresh through model.options (RPC + REST /api/model/options) -> build_models_payload -> list_authenticated_providers, which calls clear_provider_models_cache() up front when set so every row re-fetches live. - Add a 'Refresh Models' control to the desktop picker (5-locale i18n, spinning sync icon). Normal opens leave refresh=false to stay snappy on the cache. Verified: stale cache hides deepseek-v4-flash-free -> refresh busts it -> live re-fetch surfaces it. refresh=false never touches the cache.	2026-06-18 21:37:41 -07:00
Brooklyn Nicholson	49596b70cb	fix(gateway): resume follows the compression tip so post-compression replies render Auto-compression ends the live session and forks a continuation child (linked via parent_session_id). A long-lived parent keeps its own flushed message rows, so resolve_resume_session_id()'s empty-head walk never redirected it — resuming the parent id reloaded the pre-compression transcript and dropped every turn generated after compression, including the assistant's response. On the desktop this is the recurring "I sent a message, came back, and the reply isn't there" report on large sessions: the chat's routed id is the pre-rotation id, and both the gateway session.resume RPC and the REST /messages read anchored on it. Fix the resolver at the chokepoint: resolve_resume_session_id() now follows the compression-continuation chain forward via get_compression_tip() before its existing empty-head descendant walk. get_compression_tip() only follows children whose parent ended with end_reason='compression' (created after the parent was ended), so delegation/branch children never hijack a resume. This fixes every resume caller at once (REST /messages, CLI --resume, gateway /resume). session.resume in tui_gateway was the one resume path that never called the resolver — it used the raw target id directly. Route it through resolve_resume_session_id() too (non-lazy only; lazy watch windows must stay on their exact child branch). Resolving up front also re-anchors the live-session fast path so a still-live rotated session is reused by its new key instead of rebuilding a duplicate agent on the stale parent. Tests: - resolve_resume_session_id follows the tip even when the parent retains messages, and is not confused by a delegation child. - session.resume binds the agent to the continuation tip and returns the post-compression reply.	2026-06-18 15:56:43 -05:00
islam666	9705e7944a	fix(picker): remove max_models=50 cap in interactive model pickers The interactive model pickers (Desktop REST API, TUI model.options, CLI /model) were hard-capped at max_models=50, which truncated large provider catalogs like Kilo Gateway (336 models) to just 50 entries. This made most models undiscoverable via the picker search box. Changes: - Change build_models_payload() default from max_models=50 to None (unlimited) - Change list_authenticated_providers() default from max_models=8 to None - Change list_picker_providers() default from max_models=8 to None - Fix all [:max_models] slicing to handle None as 'no limit' - Remove max_models=50 from 5 interactive picker callers: * web_server.py: get_model_options (Desktop /api/model/options) * web_server.py: get_recommended_default_model * model_switch.py: prewarm_picker_cache_async * tui_gateway/server.py: model.options JSON-RPC * cli.py: HermesCLI model picker - Telegram/Discord inline keyboard picker (gateway/slash_commands.py) still passes max_models=50 explicitly — unchanged behavior. The total_models field was already in the response payload and is now meaningful since models.length == total_models for interactive pickers. Fixes #48279	2026-06-18 13:47:31 -07:00
Siddharth Balyan	73cd8622f9	feat(billing): /billing terminal billing — interactive TUI + CLI client (#45449 ) * feat(billing): nous_billing http client + BillingState core (phase 2b) Phase 2b terminal-billing client foundation: - hermes_cli/nous_billing.py: typed client for the 4 /api/billing/* endpoints (state/charge/poll/auto-top-up). Raises typed errors (BillingScopeRequired, BillingRateLimited, BillingAuthError) mapped from the live-verified contract; fail-open is the caller's job. Idempotency-Key enforced client-side. - agent/billing_view.py: surface-agnostic BillingState core + Decimal money parsing (server emits decimal strings, not 2dp), fail-open builder, idempotency-key gen, custom-amount validation. - 51 unit tests (decimal parse/format, payload tiering, error->exception matrix, fail-open, amount validation). Plan: docs/plans/2026-06-13-001-phase-2b-terminal-billing-tui-plan.md * feat(billing): billing:manage scope + lazy step-up re-auth (phase 2b) - NOUS_BILLING_MANAGE_SCOPE constant. - nous_token_has_billing_scope(): split-based scope check (no false-positive substring match). - step_up_nous_billing_scope(): re-runs the device flow requesting billing:manage, reusing the held credential's portal/inference URLs + client_id (so a preview stays a preview), persists like _login_nous but WITHOUT the model picker. Returns True iff the minted token carries the scope (False when NAS silently downscopes a non-admin / unticked grant). Lazy step-up (plan D-A): normal login path unchanged; 403 insufficient_scope from a billing call triggers this. 7 unit tests. * feat(billing): billing JSON-RPC methods for the TUI (phase 2b) billing.state / charge / charge_status / auto_reload / step_up in tui_gateway/server.py. 
Return STRUCTURED success envelopes (result.ok + result.error=<code>) rather than JSON-RPC-level errors, so the Ink rpc() promise always resolves and the TUI branches on the typed billing error code (insufficient_scope, rate_limited, no_payment_method, …) to render the right affordance. Money serialized as decimal STRINGS + display strings. charge mints + echoes an idempotency_key for retry reuse. 16 unit tests. * feat(billing): /billing CLI handler + command registry (phase 2b) - CommandDef("billing", subcommands=buy|auto-reload|limit), added to _SLACK_VIA_HERMES_ONLY so it routes via /hermes on Slack (keeps the 50-cap parity test green, same as /credits). - cli.py::_show_billing + screen helpers: all 5 screens (overview, buy→confirm→poll, auto-reload, monthly-limit read-only). Reuses _prompt_text_input_modal / _prompt_text_input (D-C). Non-interactive (_app is None) renders text + portal deep-link, never prompts (R7). Decimal money end-to-end. 2s/5-min cancellable poll loop; 429/503 = retry not failure; settled = ledger truth. Lazy step-up on 403 insufficient_scope. no_payment_method treated as mainline funnel-to-portal. - 6 CLI tests; 156 command tests (incl. Slack/Telegram parity) green. * feat(billing): /billing Ink TUI screens + tests (phase 2b) - ui-tui/src/app/slash/commands/billing.ts: /billing TUI command covering all 5 screens — overview (text), buy <amt> → ConfirmReq → charge → non-blocking 2s/5-min poll loop → settled/failed/timeout branches, auto-reload <below> <to> → ConfirmReq → PATCH, limit (read-only). Reuses the existing ConfirmReq overlay (D-C) — no bespoke component. Typed-error envelope branching: insufficient_scope arms the lazy step-up confirm; no_payment_method/rate_limited/cap funnel to portal. Client-side amount validation mirrors the server (bounds + 2dp). - gatewayTypes.ts: Billing* response interfaces. - registry.ts: register billingCommands. 
- billingCommand.test.ts: 12 vitest cases (overview/gating/buy-confirm-poll- settled/no_payment_method/step-up/limit/auto-reload/validation). TUI build green; 12/12 vitest pass; slash tests pass once @hermes/ink is built. * docs(billing): scrub private cross-repo references NAS is a private repo — remove all references to it from the public PR: - drop the cross-repo planning doc (planning scaffolding, not a deliverable; the PR description documents the design) - replace 'NAS' / 'PR #412 preview' mentions in code + test comments with generic 'the server' / 'a preview deployment' * docs(billing): scrub final NAS reference in step-up docstring * docs(billing): drop dangling plan-doc refs The phase-2b plan doc was removed in the cross-repo scrub (`300afcc0b`) but two module docstrings still pointed at it. Drop the dead refs. * feat(billing): interactive /billing overlay + step-up UX, portal-URL & token fixes Adds the interactive /billing TUI overlay and hardens the terminal-billing client across CLI and TUI. - TUI: full /billing overlay state machine (overview to buy to confirm, auto-reload, read-only monthly limit) reusing the existing confirm overlay. - Step-up: surface the verification link in-transcript and open the browser via the TUI's own opener (the device flow runs in the headless gateway, so a printed URL was being dropped); run the step-up handler off the main loop and emit the link as an out-of-band event so the gateway stays responsive. - Step-up copy is scope-accurate ("Billing permission granted") and re-checks /state so it never claims "enabled" when the org kill-switch is still off. - Portal deep-links resolve to absolute URLs against the active portal base (the server emits them relative) - fixes a bare "/billing?topup=open" link. - Billing calls refresh an expired access token via the stored refresh token instead of reporting a false "not logged in". 
- Optimistic funnel: advise "set up a saved card on the portal" up front when no card is on file (advisory, not a hard gate). - Token resolution is cached briefly so the 2s charge poll loop stops re-locking + re-reading the auth store on every tick; 401 re-resolves fresh. - Remove the temporary demo-mode shims. Validation: 87 Python billing tests, 88 TS tests (billing command + gateway event handler), tsc clean, ink + ui-tui builds green. * docs(billing): add /billing TUI screenshots for PR * fix(cli): guard _last_invalidate on bare instances; update stale prompt-fallback test The UI-invalidate throttle read self._last_invalidate unconditionally, which raised AttributeError on HermesCLI instances built without __init__ (the thread-safety test's object.__new__ shell). Guard the read with getattr. The off-main-thread branch of _prompt_text_input was changed (#23185) to cancel cleanly to None instead of falling back to a bare input() that would hang on the slash-worker thread; the test still asserted the old direct-input fallback. Update it to assert the current intended behavior: returns None, calls neither run_in_terminal nor input(), and does not hang.	2026-06-19 01:53:32 +05:30
Brooklyn Nicholson	51ee5b2c94	fix(desktop,tui): surface self-improvement review summary + honor memory_notifications The "💾 Self-improvement review" summary (skill/memory updated) was invisible on two surfaces: - Desktop Electron app had no review.summary event handler — skill/memory writes happened silently. Now appends a persistent system message to the transcript (matching the Ink TUI's persistent-line semantics, not a transient toast that can be missed). - tui_gateway (backs both 'hermes --tui' and the desktop) never read display.memory_notifications, so it always behaved as 'on' and ignored a user who set 'off'/'verbose'. Added _load_memory_notifications() (mirrors the messaging gateway's bool->str normalization, defaults to 'on') and wired it to agent.memory_notifications, matching gateway/run.py and the CLI. Delivery chain now reaches all surfaces: background_review.py -> background_review_callback -> review.summary event -> desktop transcript / Ink TUI line / gateway message / CLI print.	2026-06-18 13:22:12 -05:00
Teknium	0fa7d6f660	fix(desktop): never persist or restore a named custom provider as bare "custom" (#48547 ) * Port from cline/cline#11514: encourage parallel tool calls Add a universal system-prompt guidance block telling the model to batch independent tool calls (reads, searches, web fetches, read-only commands) into a single assistant turn instead of one call per turn. The runtime already executes independent batches concurrently (read-only tools always; non-overlapping path-scoped file ops); the open-source system prompt had nothing steering the model to PRODUCE the batch. Fewer round-trips means less resent context, which compounds over a long conversation. - prompt_builder.py: new PARALLEL_TOOL_CALL_GUIDANCE block (short, static, cache-amortised) modeled on TASK_COMPLETION_GUIDANCE. - system_prompt.py: inject right after the task-completion block, gated by agent.valid_tool_names + the new toggle. - agent_init.py: read agent.parallel_tool_call_guidance (default True). - config.py: add the default under the agent section. - test_prompt_builder.py: behavior-contract tests (batching steer, dependent carve-out, length bound) — invariants, not wording snapshots. Adapted from Cline's TypeScript tool-surface guidance to hermes-agent's Python prompt-assembly architecture and config-over-env conventions. * fix(desktop): never persist or restore a named custom provider as bare "custom" Custom providers vanish from the Desktop/TUI model picker with "No LLM provider configured" — repeatedly fixed (#44062, #44109, #45578) and repeatedly regressed (#44022, #47714) because every fix only recovered the entry identity from a persisted base_url. When a session is persisted/restored with the resolved provider "custom" and NO base_url, bare "custom" leaked through verbatim; resolve_runtime_provider("custom") routes to the OpenRouter default URL with no api_key, so the next turn/resume dies. 
Bare "custom" is the resolved billing class shared by every named providers:/custom_providers: entry — it is not a routable identity. Centralize the "never let bare custom escape" invariant in one helper, runtime_provider.canonical_custom_identity(), and apply it at all four leak sites in tui_gateway/server.py: - _ensure_session_db_row — the ORIGIN: first DB write seeds the bad row - _runtime_model_config — live persist - _stored_session_runtime_overrides — resume restore (heals old rows; drops unrecoverable bare custom so resume falls back to config default) - _make_agent — rebuild / per-turn The helper recovers custom:<name> from the endpoint URL when present, else from config.model.provider (the durable identity left when no base_url survived). Regression tests in test_custom_provider_session_persistence.py lock the no-base_url vector at every site so it cannot regress again.	2026-06-18 11:11:51 -07:00
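
The "never let bare custom escape" invariant described in commit 0fa7d6f660 can be illustrated with a minimal sketch. This is not the real runtime_provider.canonical_custom_identity(); the function name, its parameters, and the named_endpoints mapping are hypothetical, and the URL matching by simple trailing-slash normalization is an assumption made for illustration only.

```python
from typing import Mapping, Optional


def canonical_custom_identity_sketch(
    provider: str,
    base_url: Optional[str],
    named_endpoints: Mapping[str, str],
    config_provider: Optional[str],
) -> Optional[str]:
    """Hypothetical stand-in for the helper named in the commit message.

    named_endpoints maps a custom provider name to its endpoint URL
    (an assumed shape, not the project's actual config schema).
    """
    if provider != "custom":
        # Anything other than bare "custom" is already a routable identity.
        return provider

    if base_url:
        # First choice: recover custom:<name> from the endpoint URL.
        wanted = base_url.rstrip("/")
        for name, url in named_endpoints.items():
            if url.rstrip("/") == wanted:
                return f"custom:{name}"

    # Second choice: the durable identity left in config when no base_url survived.
    if config_provider and config_provider.startswith("custom:"):
        return config_provider

    # Unrecoverable bare "custom": drop it so the caller uses the config default.
    return None


endpoints = {"lab": "https://llm.example/v1"}
print(canonical_custom_identity_sketch("custom", "https://llm.example/v1/", endpoints, None))
print(canonical_custom_identity_sketch("custom", None, endpoints, "custom:lab"))
print(canonical_custom_identity_sketch("custom", None, endpoints, None))
```

Returning None rather than the string "custom" is the point of the sketch: a caller can never persist or route the bare billing class by accident, which matches the resume-restore behaviour the commit describes (old rows are healed, unrecoverable ones fall back to the config default).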

337 commits