Commit graph

12006 commits

Author SHA1 Message Date
islam666
9705e7944a fix(picker): remove max_models=50 cap in interactive model pickers
The interactive model pickers (Desktop REST API, TUI model.options, CLI
/model) were hard-capped at max_models=50, which truncated large provider
catalogs like Kilo Gateway (336 models) to just 50 entries. This made
most models undiscoverable via the picker search box.

Changes:
- Change build_models_payload() default from max_models=50 to None (unlimited)
- Change list_authenticated_providers() default from max_models=8 to None
- Change list_picker_providers() default from max_models=8 to None
- Fix all [:max_models] slicing to handle None as 'no limit'
- Remove max_models=50 from 5 interactive picker callers:
  * web_server.py: get_model_options (Desktop /api/model/options)
  * web_server.py: get_recommended_default_model
  * model_switch.py: prewarm_picker_cache_async
  * tui_gateway/server.py: model.options JSON-RPC
  * cli.py: HermesCLI model picker
- Telegram/Discord inline keyboard picker (gateway/slash_commands.py)
  still passes max_models=50 explicitly — unchanged behavior.

The total_models field was already in the response payload and is now
meaningful since models.length == total_models for interactive pickers.

Fixes #48279
2026-06-18 13:47:31 -07:00
alelpoan
4ed2f33994
fix(thread): allow scrolling long user messages in chat history (#48619) 2026-06-18 15:44:27 -05:00
teknium1
0879d5cc8f fix(gateway): preserve original transcript when /compress rotation is skipped
The manual /compress handler called rewrite_transcript() unconditionally on
the session id returned by _compress_context(). When rotation does not occur
(e.g. _session_db unavailable, or the DB split raised), session_id is unchanged
and rewrite_transcript() DELETEs the original messages and replaces them with
only the compressed summary — permanent data loss (#44794, #39704).

Guard the rewrite on actual rotation: only overwrite when _compress_context
produced a new session id. Otherwise leave the original transcript intact and
log a warning.
2026-06-18 13:38:35 -07:00
kyssta-exe
81ff916e57 fix(agent): flush un-persisted messages before session rotation (#47202)
compress_context() rotates the session (end_session -> create_session)
mid-turn when auto-compress triggers, but never called
_flush_messages_to_session_db() first. Messages generated during the
current turn that hadn't been persisted to state.db were silently lost.

The same bug existed in cli.py:new_session() (/new command). Both paths
now flush un-persisted messages before ending the old session.
2026-06-18 13:38:35 -07:00
Siddharth Balyan
73cd8622f9
feat(billing): /billing terminal billing — interactive TUI + CLI client (#45449)
* feat(billing): nous_billing http client + BillingState core (phase 2b)

Phase 2b terminal-billing client foundation:
- hermes_cli/nous_billing.py: typed client for the 4 /api/billing/* endpoints
  (state/charge/poll/auto-top-up). Raises typed errors (BillingScopeRequired,
  BillingRateLimited, BillingAuthError) mapped from the live-verified contract;
  fail-open is the caller's job. Idempotency-Key enforced client-side.
- agent/billing_view.py: surface-agnostic BillingState core + Decimal money
  parsing (server emits decimal strings, not 2dp), fail-open builder,
  idempotency-key gen, custom-amount validation.
- 51 unit tests (decimal parse/format, payload tiering, error->exception
  matrix, fail-open, amount validation).

Plan: docs/plans/2026-06-13-001-phase-2b-terminal-billing-tui-plan.md

* feat(billing): billing:manage scope + lazy step-up re-auth (phase 2b)

- NOUS_BILLING_MANAGE_SCOPE constant.
- nous_token_has_billing_scope(): split-based scope check (no false-positive
  substring match).
- step_up_nous_billing_scope(): re-runs the device flow requesting
  billing:manage, reusing the held credential's portal/inference URLs + client_id
  (so a preview stays a preview), persists like _login_nous but WITHOUT the model
  picker. Returns True iff the minted token carries the scope (False when NAS
  silently downscopes a non-admin / unticked grant).

Lazy step-up (plan D-A): normal login path unchanged; 403 insufficient_scope
from a billing call triggers this. 7 unit tests.

* feat(billing): billing JSON-RPC methods for the TUI (phase 2b)

billing.state / charge / charge_status / auto_reload / step_up in
tui_gateway/server.py. Return STRUCTURED success envelopes (result.ok +
result.error=<code>) rather than JSON-RPC-level errors, so the Ink rpc() promise
always resolves and the TUI branches on the typed billing error code
(insufficient_scope, rate_limited, no_payment_method, …) to render the right
affordance. Money serialized as decimal STRINGS + display strings. charge mints
+ echoes an idempotency_key for retry reuse. 16 unit tests.

* feat(billing): /billing CLI handler + command registry (phase 2b)

- CommandDef("billing", subcommands=buy|auto-reload|limit), added to
  _SLACK_VIA_HERMES_ONLY so it routes via /hermes on Slack (keeps the 50-cap
  parity test green, same as /credits).
- cli.py::_show_billing + screen helpers: all 5 screens (overview, buy→confirm→
  poll, auto-reload, monthly-limit read-only). Reuses _prompt_text_input_modal /
  _prompt_text_input (D-C). Non-interactive (_app is None) renders text + portal
  deep-link, never prompts (R7). Decimal money end-to-end. 2s/5-min cancellable
  poll loop; 429/503 = retry not failure; settled = ledger truth. Lazy step-up on
  403 insufficient_scope. no_payment_method treated as mainline funnel-to-portal.
- 6 CLI tests; 156 command tests (incl. Slack/Telegram parity) green.

* feat(billing): /billing Ink TUI screens + tests (phase 2b)

- ui-tui/src/app/slash/commands/billing.ts: /billing TUI command covering all 5
  screens — overview (text), buy <amt> → ConfirmReq → charge → non-blocking 2s/
  5-min poll loop → settled/failed/timeout branches, auto-reload <below> <to> →
  ConfirmReq → PATCH, limit (read-only). Reuses the existing ConfirmReq overlay
  (D-C) — no bespoke component. Typed-error envelope branching: insufficient_scope
  arms the lazy step-up confirm; no_payment_method/rate_limited/cap funnel to
  portal. Client-side amount validation mirrors the server (bounds + 2dp).
- gatewayTypes.ts: Billing* response interfaces.
- registry.ts: register billingCommands.
- billingCommand.test.ts: 12 vitest cases (overview/gating/buy-confirm-poll-
  settled/no_payment_method/step-up/limit/auto-reload/validation).

TUI build green; 12/12 vitest pass; slash tests pass once @hermes/ink is built.

* docs(billing): scrub private cross-repo references

NAS is a private repo — remove all references to it from the public PR:
- drop the cross-repo planning doc (planning scaffolding, not a deliverable;
  the PR description documents the design)
- replace 'NAS' / 'PR #412 preview' mentions in code + test comments with
  generic 'the server' / 'a preview deployment'

* docs(billing): scrub final NAS reference in step-up docstring

* docs(billing): drop dangling plan-doc refs

The phase-2b plan doc was removed in the cross-repo scrub (300afcc0b)
but two module docstrings still pointed at it. Drop the dead refs.

* feat(billing): interactive /billing overlay + step-up UX, portal-URL & token fixes

Adds the interactive /billing TUI overlay and hardens the terminal-billing
client across CLI and TUI.

- TUI: full /billing overlay state machine (overview to buy to confirm,
  auto-reload, read-only monthly limit) reusing the existing confirm overlay.
- Step-up: surface the verification link in-transcript and open the browser
  via the TUI's own opener (the device flow runs in the headless gateway, so a
  printed URL was being dropped); run the step-up handler off the main loop and
  emit the link as an out-of-band event so the gateway stays responsive.
- Step-up copy is scope-accurate ("Billing permission granted") and re-checks
  /state so it never claims "enabled" when the org kill-switch is still off.
- Portal deep-links resolve to absolute URLs against the active portal base
  (the server emits them relative) - fixes a bare "/billing?topup=open" link.
- Billing calls refresh an expired access token via the stored refresh token
  instead of reporting a false "not logged in".
- Optimistic funnel: advise "set up a saved card on the portal" up front when
  no card is on file (advisory, not a hard gate).
- Token resolution is cached briefly so the 2s charge poll loop stops
  re-locking + re-reading the auth store on every tick; 401 re-resolves fresh.
- Remove the temporary demo-mode shims.

Validation: 87 Python billing tests, 88 TS tests (billing command + gateway
event handler), tsc clean, ink + ui-tui builds green.

* docs(billing): add /billing TUI screenshots for PR

* fix(cli): guard _last_invalidate on bare instances; update stale prompt-fallback test

The UI-invalidate throttle read self._last_invalidate unconditionally, which
raised AttributeError on HermesCLI instances built without __init__ (the
thread-safety test's object.__new__ shell). Guard the read with getattr.

The off-main-thread branch of _prompt_text_input was changed (#23185) to cancel
cleanly to None instead of falling back to a bare input() that would hang on the
slash-worker thread; the test still asserted the old direct-input fallback.
Update it to assert the current intended behavior: returns None, calls neither
run_in_terminal nor input(), and does not hang.
2026-06-19 01:53:32 +05:30
brooklyn!
81eaedd0f5
Merge pull request #48533 from NousResearch/hermes/hermes-4061c6a8
fix(prompt,desktop,tui): dedupe parallel-tool-call steer + surface self-improvement review summary
2026-06-18 13:27:07 -05:00
Brooklyn Nicholson
51ee5b2c94 fix(desktop,tui): surface self-improvement review summary + honor memory_notifications
The "💾 Self-improvement review" summary (skill/memory updated) was invisible
on two surfaces:

- Desktop Electron app had no review.summary event handler — skill/memory
  writes happened silently. Now appends a persistent system message to the
  transcript (matching the Ink TUI's persistent-line semantics, not a
  transient toast that can be missed).
- tui_gateway (backs both 'hermes --tui' and the desktop) never read
  display.memory_notifications, so it always behaved as 'on' and ignored a
  user who set 'off'/'verbose'. Added _load_memory_notifications() (mirrors
  the messaging gateway's bool->str normalization, defaults to 'on') and
  wired it to agent.memory_notifications, matching gateway/run.py and the CLI.

Delivery chain now reaches all surfaces:
background_review.py -> background_review_callback -> review.summary event ->
desktop transcript / Ink TUI line / gateway message / CLI print.
2026-06-18 13:22:12 -05:00
Brooklyn Nicholson
07e785d60a fix(prompt): dedupe parallel-tool-call steer; correct its rationale
The universal PARALLEL_TOOL_CALL_GUIDANCE block already lives on main, but it
shipped with two rough edges this change cleans up:

- It duplicated the batching steer for Google models. The
  GOOGLE_MODEL_OPERATIONAL_GUIDANCE block still carried its own
  "Parallel tool calls" bullet, so Gemini/Gemma received the instruction
  twice in one prompt. Drop the redundant bullet — the universal block is now
  the single source.
- Its comment claimed "nothing in the open-source system prompt encouraged
  batching," which was wrong: the steer existed for Google models only. Reword
  to say the gap was that every *other* model got nothing.
- Tighten the test that asserts the steer (precedence-correct), and add an
  invariant guarding against re-introducing the Google duplicate.
2026-06-18 13:22:12 -05:00
Teknium
0fa7d6f660
fix(desktop): never persist or restore a named custom provider as bare "custom" (#48547)
* Port from cline/cline#11514: encourage parallel tool calls

Add a universal system-prompt guidance block telling the model to batch
independent tool calls (reads, searches, web fetches, read-only commands)
into a single assistant turn instead of one call per turn. The runtime
already executes independent batches concurrently (read-only tools always;
non-overlapping path-scoped file ops); the open-source system prompt had
nothing steering the model to PRODUCE the batch. Fewer round-trips means
less resent context, which compounds over a long conversation.

- prompt_builder.py: new PARALLEL_TOOL_CALL_GUIDANCE block (short, static,
  cache-amortised) modeled on TASK_COMPLETION_GUIDANCE.
- system_prompt.py: inject right after the task-completion block, gated by
  agent.valid_tool_names + the new toggle.
- agent_init.py: read agent.parallel_tool_call_guidance (default True).
- config.py: add the default under the agent section.
- test_prompt_builder.py: behavior-contract tests (batching steer, dependent
  carve-out, length bound) — invariants, not wording snapshots.

Adapted from Cline's TypeScript tool-surface guidance to hermes-agent's
Python prompt-assembly architecture and config-over-env conventions.

* fix(desktop): never persist or restore a named custom provider as bare "custom"

Custom providers vanish from the Desktop/TUI model picker with
"No LLM provider configured" — repeatedly fixed (#44062, #44109, #45578)
and repeatedly regressed (#44022, #47714) because every fix only recovered
the entry identity from a persisted base_url. When a session is
persisted/restored with the resolved provider "custom" and NO base_url, bare
"custom" leaked through verbatim; resolve_runtime_provider("custom") routes to
the OpenRouter default URL with no api_key, so the next turn/resume dies.

Bare "custom" is the resolved billing class shared by every named providers:/
custom_providers: entry — it is not a routable identity. Centralize the
"never let bare custom escape" invariant in one helper,
runtime_provider.canonical_custom_identity(), and apply it at all four leak
sites in tui_gateway/server.py:

- _ensure_session_db_row  — the ORIGIN: first DB write seeds the bad row
- _runtime_model_config   — live persist
- _stored_session_runtime_overrides — resume restore (heals old rows; drops
  unrecoverable bare custom so resume falls back to config default)
- _make_agent             — rebuild / per-turn

The helper recovers custom:<name> from the endpoint URL when present, else
from config.model.provider (the durable identity left when no base_url
survived). Regression tests in test_custom_provider_session_persistence.py
lock the no-base_url vector at every site so it cannot regress again.
2026-06-18 11:11:51 -07:00
Teknium
38c8a9c10f
feat(memory): batch operations for single-turn memory updates (#48507)
The memory tool was strictly one-op-per-call. With the store running near
its char limit by design, a new add that would overflow gets rejected with
'consolidate now, then retry' -- but the model could not consolidate and add
in one call. It had to remove/replace across several turns, then retry the
add, each turn re-sending the whole conversation context. Expensive thrash.

Add an 'operations' array: a list of add/replace/remove ops applied
atomically against the FINAL char budget. The model frees space and adds new
entries in ONE call, even when an add alone would overflow. All-or-nothing:
any bad op aborts the whole batch, nothing written.

Root-cause note: the two agent-level memory interception sites
(agent_runtime_helpers.py, tool_executor.py) silently dropped any param not
in their explicit kwarg list, so 'operations' never reached the handler and
batch calls failed with 'Unknown action None'. Both now pass it through and
bridge each add/replace op to external memory providers.

Also: success response is now terminal (done=true + 'do not repeat' note,
no full-entries echo that invited re-edits); schema rewritten to lead with
the batch mechanism and an explicit one-shot stop rule (2138 -> 1476 chars).

Live-verified: near-full consolidate-and-add went 7 calls -> 1 call,
stable across 3 reps. 103 memory/approval tests + 398 background-review/
run_agent tests green; 6 new batch tests added.
2026-06-18 10:19:33 -07:00
kshitij
2fa16ec2d2
Merge pull request #48529 from kshitijk4poor/salvage-48372-eap
fix(install): relax EAP=Stop around native git/uv calls + fail-fast on uv venv failure (#48352, salvage of #48372)
2026-06-18 22:17:53 +05:30
kshitijk4poor
fd12e59e6b fix(install): fail fast when uv venv genuinely fails under relaxed EAP
PR #48372 relaxes EAP=Stop around the uv venv call so PowerShell 5.1
doesn't mistake uv's 'Using CPython ...' stderr for a terminating
NativeCommandError. But relaxing EAP also means a *genuine* uv venv
failure (exit != 0) no longer aborts on its own — Install-Venv would
continue and print 'Virtual environment ready', and in stage mode
Invoke-Stage would report ok=true, even though no venv was created.

Capture $LASTEXITCODE immediately after the relaxed call and throw on
non-zero (Pop-Location first, matching the function's other exit paths),
so the venv stage fails fast instead of falsely succeeding. This is the
explicit guard originally proposed in #48463 (devorun), composed on top
of #48372's reusable helper + regression test.

Adds a regression test asserting the uv venv exit-code capture + throw.
2026-06-18 22:11:35 +05:30
Teknium
c37fdec2d9
feat(dashboard): surface full per-MCP catalog detail; fix pip-install doc (#48520)
The dashboard MCP catalog only showed name/description/transport and a
non-clickable source. Users couldn't see what an entry connects to or runs
before installing — the exact detail the docs trust model tells them to vet.

- /api/mcp/catalog now returns transport target (url, or command+args),
  auth_type, git install source/ref + bootstrap commands, default-enabled
  tool hint, and post-install guidance per entry.
- McpPage renders the endpoint URL (http) or command+args (stdio), the git
  install source/ref, a collapsible bootstrap-commands list, setup notes,
  and the source as a clickable link when it's a URL.
- Docs: drop the 'uv pip install -e .[mcp]' quick-start step (Hermes does
  not support pip installs; MCP ships with the standard install) and note
  the dashboard now surfaces this detail.
- Strengthen the catalog endpoint test to assert the new inspection fields.
2026-06-18 09:40:56 -07:00
kshitij
4af16b5da2
Merge pull request #48206 from ehz0ah/fix/openviking-current-api-rebased
fix(openviking): adapt memory provider for current api
2026-06-18 21:53:42 +05:30
teknium1
5ffbfed193 feat(mcp-catalog): add official Unreal Engine 5.8 MCP server
Epic's experimental Unreal MCP plugin embeds an MCP server inside the
Unreal Editor process, served over local HTTP (127.0.0.1:8000/mcp by
default). HTTP transport, no auth, no install block — the user enables
the plugin in-editor and Hermes connects to the URL.

Also drops test_optional_mcps_manifests_ship_in_both_wheel_and_sdist:
it asserted wheel/sdist packaging targets for pip/Homebrew/Nix installs,
which Hermes does not support — installs run from the repo checkout, where
the catalog is discovered by directory iteration with no packaging step.
2026-06-18 09:16:40 -07:00
xxxigm
58ad6942d9
fix(tui): don't make Enter swallow trailing-space-only slash completions (#48425)
* fix(tui): don't make Enter swallow trailing-space-only slash completions

Submitting a slash command in the TUI took three Enter presses: one to
complete the name (/ex → /exit), a second that only appended the trailing
space the gateway adds to keep the classic-CLI prompt_toolkit dropdown open
(/exit → "/exit "), and a third to actually submit.

The composer's submit handler accepted the highlighted completion whenever
applying it changed the input at all, so the whitespace-only delta ate an
extra keypress. Treat a completion whose only change is trailing whitespace
on an already-complete token as "already complete" and fall through to
submit. Partial-name and argument completions (a real token change) still
accept on Enter as before.

The replace/accept logic is extracted into pure helpers (applyCompletion,
completionToApplyOnSubmit) in domain/slash.ts.

* test(tui): cover Enter/completion trailing-space behavior and isolate poller queue

- completionApply.test.ts asserts completionToApplyOnSubmit accepts real
  token completions (partial command name, argument) but returns null for a
  trailing-space-only delta on an already-complete command, so Enter submits
  instead of needing extra presses.
- test_notification_poller_delivers_completion / _skips_consumed previously
  shared the process-global process_registry.completion_queue. Their events
  carry no session_key, so a leaked/concurrent poller could dequeue and
  dispatch them to a fixture agent without run_conversation, flaking CI
  ("AttributeError: '_FakeAgent' object has no attribute 'run_conversation'").
  Isolate the queue per test (fresh queue.Queue via monkeypatch), matching the
  sibling poller tests that already do this.
2026-06-18 11:04:59 -05:00
Teknium
25c590ccd0 fix(skills): refuse SKILLS_DIR root in rmtree guard, not just outside-tree
The salvaged guard allowed _rmtree_writable(SKILLS_DIR) itself. No call
site ever passes the root — every site passes a skill subdir or its .bak
sibling — so allowing the root only preserves the #48200 footgun (a dest
that collapses to the root wipes every installed skill). Require a strict
strict-child relationship and update the test that documented the
nonexistent 'full reset' capability.
2026-06-18 08:53:35 -07:00
Kewe63
f1254c8eaf fix(skills): rmtree scope guard + default pre_update_backup to true (#48200)
Defense-in-depth fix for the silent wipe of ~/.hermes/ documented in
#48200. A `hermes update --yes` run silently destroyed a user's
.env, MEMORY.md, kanban.db, custom skills, and scripts. Two changes:

1. `_rmtree_writable` in tools/skills_sync.py now refuses to rmtree
   anything outside SKILLS_DIR (the HERMES_HOME/skills/ root).
   All five call sites pass paths under SKILLS_DIR, so the guard is
   a no-op for current code and a loud, recoverable failure for
   any future regression (bad path join, malicious bundled
   manifest, stale path in scope after an exception).

2. The default `updates.pre_update_backup` flips from false to
   true in hermes_cli/config.py. A few minutes of zip per update
   is negligible compared to silent total data loss. Still
   overridable; --no-backup still works for one-off opt-out.

Five new tests in TestRmtreeWritableScopeGuard (root path,
hermes home, sibling dir, skills root itself, subdir) plus a
flipped `test_default_enabled_creates_backup` in test_backup.py.
178/178 tests pass in the two affected files. Public method
signatures unchanged, no test-stub blast radius.

Closes #48200
2026-06-18 08:53:35 -07:00
Teknium
41babc702e chore(release): map iamlukethedev to AUTHOR_MAP 2026-06-18 08:53:31 -07:00
Luke The Dev
3c3ac19d9c fix(#37878): Address review feedback — fix trailing whitespace and add ANTHROPIC_API_KEY test
Review feedback from egilewski:
1. Remove trailing whitespace from test docstring and mock patches (lines 1430, 1469, 1476, 1482)
2. Expand test coverage: also verify ANTHROPIC_API_KEY is stripped (not just OPENAI_API_KEY)

Changes:
- Remove trailing whitespace from test file
- Add ANTHROPIC_API_KEY to test environment
- Add assertion verifying ANTHROPIC_API_KEY is stripped from cua-driver subprocess env
- Syntax verified: python3 -m py_compile tests/tools/test_computer_use.py ✓
2026-06-18 08:53:31 -07:00
Luke The Dev
2e5c04aaf7 fix(#37878): scrub operator environment before launching cua-driver MCP
- Use _sanitize_subprocess_env() to filter Hermes-managed credentials
  from the cua-driver subprocess environment (issue #37878)
- Prevents credential exfiltration to the third-party cua-driver binary
- Aligns with existing pattern used by browser-tool and other tools
- Add regression test to verify environment sanitization

The cua-driver is a lower-trust MCP subprocess per SECURITY.md §2.3.
Its inherited environment is now scrubbed by default, removing provider
API keys, gateway tokens, and platform credentials that should not leak
to third-party binaries.

Fixes #37878
2026-06-18 08:53:31 -07:00
kshitij
b39ec2fc37
Merge pull request #48341 from xxxigm/fix/install-ps1-powershell-host-resolution
fix(install): resolve PowerShell host instead of bare `powershell` for uv install
2026-06-18 21:09:50 +05:30
Siddharth Balyan
646cd1b43e
fix(nix): refresh npmDepsHash after the Electron 40.10.2 pin (#47792) (#48457)
PR #47792 pinned Electron to an exact 40.10.2 and regenerated the root
package-lock.json (dropping @electron/get@5 + @electron-internal/extract-zip,
restoring @electron/get@2 + extract-zip@2 + yauzl), but did not refresh the
shared npmDepsHash in nix/lib.nix. The hash still described the previous
40.10.3 lockfile, so npmConfigHook fails on every Nix build with
"npmDepsHash is out of date" for hermes-tui / hermes-web / hermes-desktop.

Regenerate the single shared hash to match the current lockfile.

Verified with fetchNpmDeps (authoritative, not prefetch-npm-deps):
  nix build .#tui.npmDeps  -> builds clean
  nix build .#tui          -> Validating consistency -> Installing dependencies
                              -> Finished npmConfigHook (no hash error)
2026-06-18 15:00:08 +00:00
teknium1
ef4b897a18 chore(release): map srojk34 author email 2026-06-18 05:55:17 -07:00
srojk34
92e6d8c858 fix(desktop): dispose open PTY sessions in before-quit handler
The `before-quit` handler tears down the bootstrap controller, preview
watchers, and the Python backend but never disposes live PTY sessions.
When `app.quit()` proceeds to `FreeEnvironment()`, node-pty's
`ThreadSafeFunction::CallJS` callback fires on a half-torn-down
environment, throws a C++ exception that can no longer be caught, and
the process aborts (microsoft/node-pty#904).

Iterate `terminalSessions` and call `disposeTerminalSession()` (which
already calls `pty.kill()` + deletes the map entry) before killing the
backend, so the ThreadSafeFunctions are removed before teardown begins.

Closes #48335
2026-06-18 05:55:17 -07:00
Teknium
2f7c4858a7
fix(tui): refresh tool snapshot when MCP discovery lands after agent build (#48403)
The TUI banner reported fewer tools than the classic CLI for the same
config (e.g. 32 vs 38) when an MCP server connected slowly. Root cause:
the agent snapshots `agent.tools` once at build time and never re-reads
the registry. `_make_agent` briefly joins the background MCP discovery
thread (`wait_for_mcp_discovery`, ~0.75s) so fast servers land in that
snapshot, but a server slower than the bound — common for an HTTP MCP
server on first connect — lands *after* the agent is built. Its tools are
then absent from both the agent (uncallable until `/reload-mcp`) and the
banner for the whole session.

The classic CLI doesn't hit this because it re-derives
`get_tool_definitions()` at banner render time (which re-waits for
discovery), so it picks the late tools up.

Fix: after a fresh agent is built and its first `session.info` emitted,
if discovery is still in flight, schedule an off-critical-path daemon that
waits for it to finish, then rebuilds the tool snapshot and re-emits
`session.info` — the same rebuild `/reload-mcp` performs, but automatic.
Both the agent's callable tools and the banner count catch up.

Cache safety: the rebuild runs only while the session is still
pre-first-turn (`_user_turn_count`/`_api_call_count` both 0 → nothing
cached to invalidate). Once the user has sent a message we leave the
snapshot frozen rather than break the cached prompt prefix mid-conversation;
late tools then require an explicit `/reload-mcp` (user-consented), exactly
as today. No-op when discovery finished before the agent build, when the
join times out, when the registry was unchanged, or when the session was
swapped/closed while waiting.

Adds entry.mcp_discovery_in_flight() / join_mcp_discovery() accessors and
covers the matrix (added/none/post-turn/timeout/unchanged/replaced) with
unit tests.
2026-06-18 05:41:23 -07:00
Teknium
8abdab24c9
fix(tui): MCP headline counts connected servers, not disabled ones (#48402)
The TUI banner footer used the raw `info.mcp_servers.length`, so a
configured-but-disabled server (e.g. `linear`) was counted alongside
connected ones. With a disabled `linear` and a connected `nous-support`,
the TUI reported "2 MCP" while the classic CLI correctly reported "1 MCP"
(`mcp_connected = sum(1 for s in mcp_status if s["connected"])` in
hermes_cli/banner.py).

The collapse toggle even labels the count "connected", which was wrong
for the same reason.

Count connected servers for both the toggle and the footer segment, and
drop the `· N MCP` segment entirely when none are connected (matching the
classic banner, which only appends it when the count is > 0). The
expandable MCP section still lists every configured server, including
disabled ones.

Invariant test renders SessionPanel and asserts the headline equals the
connected count, never the configured total.
2026-06-18 05:41:19 -07:00
Tranquil-Flow
67316fdc94 fix(install): relax native stderr handling in install.ps1 (#48352) 2026-06-18 12:06:29 +02:00
xxxigm
feff283e17 test(install): lock uv installer to a resolved PowerShell host
Source-level guard (install.ps1 only runs on Windows, so there's no Linux CI
runner to execute it): the astral uv install line must be invoked via the call
operator on a resolved host variable, the bare-`powershell` literal that
produced the field-reported "The term 'powershell' is not recognized" must be
gone, and the resolver must be PATH-independent (Get-Process -Id $PID) and
pwsh-aware.
2026-06-18 16:26:34 +07:00
xxxigm
a14bae6bcc fix(install): resolve PowerShell host instead of bare powershell for uv
The Windows installer's Install-Uv spawned the astral uv installer with a
hardcoded bare `powershell -ExecutionPolicy ByPass -c "irm .../uv | iex"`.
That name resolves only to Windows PowerShell, and only when its System32
directory is on PATH. Run under PowerShell 7+ (`pwsh`) — or any session where
`powershell` isn't on PATH — the spawn dies with "The term 'powershell' is not
recognized", and uv installation aborts (the installer then appears stuck).

Add Get-PowerShellHostExe, which prefers the absolute path of the host we're
already running in (PATH-independent), then falls back to powershell/pwsh via
Get-Command, then to the bare name. Install-Uv now invokes that resolved exe.
2026-06-18 16:26:34 +07:00
qin-ctx
2a5d51c16e fix(openviking): adapt memory provider for current api
(cherry picked from commit cbb87389f3)
2026-06-18 16:58:11 +08:00
kshitij
426f321e84
Merge pull request #48299 from NousResearch/chore/author-map-infinitycrew39
Some checks failed
Deploy Site / deploy-vercel (push) Waiting to run
Deploy Site / deploy-docs (push) Waiting to run
Docker Build and Publish / build-amd64 (push) Waiting to run
Docker Build and Publish / build-arm64 (push) Waiting to run
Docker Build and Publish / merge (push) Blocked by required conditions
Lint (ruff + ty) / ruff + ty diff (push) Waiting to run
Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run
Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run
OSV-Scanner / Scan lockfiles (push) Waiting to run
Tests / test (1) (push) Waiting to run
Tests / test (2) (push) Waiting to run
Tests / test (3) (push) Waiting to run
Tests / test (4) (push) Waiting to run
Tests / test (5) (push) Waiting to run
Tests / test (6) (push) Waiting to run
Tests / save-durations (push) Blocked by required conditions
Tests / e2e (push) Waiting to run
Typecheck / typecheck (apps/bootstrap-installer) (push) Waiting to run
Typecheck / typecheck (apps/desktop) (push) Waiting to run
Typecheck / typecheck (apps/shared) (push) Waiting to run
Typecheck / typecheck (ui-tui) (push) Waiting to run
Typecheck / typecheck (web) (push) Waiting to run
Typecheck / desktop-build (push) Waiting to run
uv.lock check / uv lock --check (push) Waiting to run
Docker / shell lint / Lint Dockerfile (hadolint) (push) Has been cancelled
Docker / shell lint / Lint docker/ shell scripts (shellcheck) (push) Has been cancelled
chore(release): map infinitycrew39 author email
2026-06-18 13:09:59 +05:30
kshitijk4poor
ca28c630c7 chore(release): map infinitycrew39 author email
Add infinitycrew39@gmail.com -> infinitycrew39 to AUTHOR_MAP so the
contributor audit resolves the two cherry-picked commits from the #47945
langfuse trace-scope salvage (merged as #48292) to a GitHub handle instead
of flagging them as an unmapped author email.
2026-06-18 13:09:34 +05:30
kshitij
9b2f7d2cb1
Merge pull request #48292 from NousResearch/fix/langfuse-trace-scope-salvage
fix(langfuse): scope trace state by turn/request ids (salvage #47945)
2026-06-18 13:08:17 +05:30
kshitijk4poor
0787ea07c8 test(langfuse): pin exact surviving key in turn-isolation test
The prior assertion `all("turn1" in k or "turn2" in k for k in keys)` was
weak on two counts: it passes vacuously when keys is empty (a regression
that lost all state would slip through), and after turn 2 finalizes only
turn 1 lingers, so it only ever inspected turn 1 anyway. Replace it with an
exact check that one key survives, it is turn 1, and turn 2 never merged
into it — the real isolation invariant the test name claims.
2026-06-18 13:00:01 +05:30
kshitijk4poor
f4fbaa6cda fix(langfuse): bound _TRACE_STATE growth from non-finalizing turns
Scoping the trace key by turn_id (the prior commit) fixed cross-turn
collisions but introduced a slow leak: _finish_trace only pops a key when a
turn ends cleanly (final response has content and no tool calls), so any
turn that is interrupted, ends on a tool call, or has empty final content
now leaves its uniquely-keyed entry in _TRACE_STATE forever. Previously the
constant per-session key was overwritten by the next turn, capping growth at
~1 entry per session.

Add an LRU cap (_MAX_TRACE_STATE) enforced by _evict_stale_locked, called
under _STATE_LOCK immediately before each insert. It evicts the
least-recently-updated entries (using the previously-dead last_updated_at
field) and ends their root span so nothing dangles. Regression test drives
50 non-finalizing turns against a cap of 8 and asserts the dict stays bounded
with the most-recent turns surviving.
2026-06-18 12:59:41 +05:30
kshitijk4poor
e1d10ec1ed refactor(langfuse): extract _scope_prefix from _trace_key
The turn- and api-scoped branches each repeated the same
task/session/thread fallback ladder with only the infix differing. Extract
the shared prefix into _scope_prefix so a future scope dimension touches one
ladder instead of three. The legacy branch still returns a bare task_id (not
the task: prefix) for backward compatibility, so it stays separate.

Output key strings are unchanged; a new test pins them across every
task/session/turn/api combination since the keys are matched across hooks
and any drift would silently break trace finalization.
2026-06-18 12:58:24 +05:30
kshitij
860cf5133a
Merge pull request #48293 from kshitijk4poor/chore/skills-diff-cleanup
refactor(skills): dedupe file-listing + share user-modified predicate (follow-up to #48286)
2026-06-18 12:49:53 +05:30
kshitijk4poor
f6fac60e66 refactor(skills): dedupe file-listing, share user-modified predicate, trim diff contract
Cleanup pass on the salvage (behavior-preserving):

- diff_bundled_skill now uses the existing _skill_file_list() helper
  instead of reimplementing the rglob/is_file/relative_to file-set
  enumeration inline (twice).
- Extract _is_tracked_user_modification(origin_hash, user_hash) and use
  it in BOTH the sync loop and list_user_modified_bundled_skills() so the
  'kept user edit' rule can't drift between the two sites.
- _read_text_for_diff -> _read_for_diff returns (bytes, text); the binary
  branch now compares the bytes it already read instead of re-reading
  both files from disk.
- Drop the unused 'user_present' key from diff_bundled_skill's return
  contract (no consumer or test ever read it).
- test_update_modified_notice: drop the brittle '>= 2 sites' count-floor
  so consolidating the two print paths into a shared helper stays a
  welcome refactor; keep the per-site 'count notice => discovery hint'
  invariant (still mutation-tested).
2026-06-18 12:42:58 +05:30
kshitijk4poor
b4356135f2 test(langfuse): add end-to-end turn-isolation regression
The PR added helper-level tests for _trace_key but nothing exercised the
keys through the real hooks. This adds TestTurnTraceIsolation, which drives
on_pre_llm_request / on_post_llm_call across two turns of one gateway
session (task_id == session_id, unique turn_id, api_call_count reset per
turn) and asserts each turn opens its own root trace when the first turn
fails to finalize (tool-only final step). This test fails on the pre-fix
code (only one trace opened, turn 2 absorbed into turn 1) and passes with
the scoping fix.

Also pins the turn_id-over-api_request_id key precedence: the turn-scoped
post_llm_call carries no api_request_id, so it must still resolve to the
same key as the request-scoped hooks or finalization breaks.
2026-06-18 12:38:44 +05:30
infinitycrew39
40ed67ccfe test(langfuse): cover turn/api trace-key scoping 2026-06-18 12:36:35 +05:30
infinitycrew39
0b54a33a34 fix(langfuse): scope trace state by turn/request ids 2026-06-18 12:36:35 +05:30
kshitij
737007e335
Merge pull request #48286 from kshitijk4poor/salvage/skills-list-modified-diff
feat(skills): find & diff user-modified bundled skills (salvage of #47802)
2026-06-18 12:33:28 +05:30
kshitijk4poor
6777916068 fix(skills): surface list-modified hint on both update paths + disambiguate diff
Salvage follow-up to the cherry-picked feat/test commits:

- W1: the unpack/install update path in main.py printed the
  '~ N user-modified (kept)' notice without the new
  'hermes skills list-modified' hint that the git-pull path got.
  Mirror the hint to both sites so the count is actionable
  regardless of which update path runs.
- W2: 'hermes skills diff <name>' (bundled-vs-stock) now shares the
  verb with the gateway write-approval 'diff <id>'. The gateway
  handler's docstring + truncation message pointed users to
  '/skills diff <id>' on the CLI, which now resolves a bundled skill
  by that name instead. Point at the pending JSON file and note the
  two diff commands are distinct.
- Add an invariant test asserting every 'user-modified (kept)' notice
  in main.py carries the discovery hint (guards sibling drift).
2026-06-18 12:28:11 +05:30
xxxigm
481f0417d8 test(skills): cover list-modified + diff for bundled skills
Exercises the real sync pipeline (no mocked comparison logic): a pristine
synced skill is not flagged; an edited one is listed and diffed (modified +
added files); an unknown skill returns not-ok; and `reset --restore` clears
the modified state so revert and discovery stay consistent.
2026-06-18 12:26:20 +05:30
xxxigm
085fc5d001 feat(skills): find & diff user-modified bundled skills
`hermes update` keeps (won't overwrite) bundled skills the user edited
locally, but only printed a count — "~ N user-modified (kept)" — with no way
to learn which skills, or see what changed. Reverting already existed
(`hermes skills reset <name> [--restore]`); discovery and inspection did not.

Add two CLI commands (zero model-tool footprint), reusing the manifest
origin-hash that sync already maintains:

- `hermes skills list-modified [--json]` — list the bundled skills whose
  on-disk copy diverges from the last-synced origin hash (the exact test the
  sync loop uses to decide what to skip).
- `hermes skills diff <name>` — unified diff between the user's copy and the
  current bundled (stock) version, so the user can confirm what changed
  before reverting.

Both are mirrored as `/skills list-modified` and `/skills diff`. The
`hermes update` notice now points at `hermes skills list-modified`. Core
helpers `list_user_modified_bundled_skills()` and `diff_bundled_skill()` live
in tools/skills_sync.py alongside the existing reset logic.
2026-06-18 12:26:20 +05:30
kshitij
edcde6b26f
Merge pull request #48265 from kshitijk4poor/chore/ov-atomic-json-write
refactor(openviking): reuse atomic_json_write for ovcli config; drop dead constants
2026-06-18 11:45:30 +05:30
kshitijk4poor
5494c1e9b6 refactor(openviking): reuse atomic_json_write for ovcli config; drop dead constants
Follow-up cleanup on the OpenViking setup path merged in #48262:

- _write_ovcli_config now uses utils.atomic_json_write(path, data, mode=0o600)
  instead of the local _precreate_secret_file + write_text + chmod sequence.
  The shared helper (already used by honcho/mem0/supermemory/hindsight) writes
  via temp-file + fchmod(0600) + fsync + os.replace, so the ovcli.conf is
  written atomically (no half-written secret file on crash) and with no
  chmod-after-write TOCTOU window. _precreate_secret_file stays for the .env
  writer path.
- Remove dead _DEFAULT_ACCOUNT/_DEFAULT_USER constants (0 references; the
  empty->'default' tenant fallback lives in the _VikingClient constructor).

Tests: tests/plugins/memory/test_openviking_provider.py + test_memory_setup.py
+ openviking_plugin/test_openviking.py -> 130 passed; ruff clean.
2026-06-18 11:40:11 +05:30
kshitij
832d5967f8
Merge pull request #48262 from kshitijk4poor/salvage-32445
feat(memory): improve OpenViking setup UX (salvage #32445)
2026-06-18 11:34:11 +05:30
Ben Barclay
eaa0984210
chore: drop committed PR-infographic assets from the repo (#48261)
PR infographics are decorative visual hooks for a PR body, not repo
artifacts. The established convention (commit 5772e638c, "chore: drop
in-repo infographic/ directory; keep PR-body URLs only", #30854) is to
hotlink an externally-hosted image so GitHub camo-proxies it inline,
leaving zero binary footprint in the tree.

Two such assets had been committed anyway and are referenced nowhere in
the codebase:

- docs/assets/ns504-chat-session-reconnect.png (1024-equiv, NS-504 PR
  infographic, added in #47674 alongside the ChatPage.tsx fix)
- infographic/kanban-db-corruption-defense/infographic.png (re-added a
  directory #30854 had explicitly removed, in #30952)

Both are unreferenced decorative infographics, so removing them has no
effect on docs, website, or app builds. Removing the latter also clears
the stray top-level infographic/ directory that #30854 had retired.

These blobs remain in history (the commits that introduced them are
already on main and bundled with real code, so they can't be dropped);
this just removes them from the working tree going forward.
2026-06-18 16:03:29 +10:00