Commit graph

12013 commits

Author SHA1 Message Date
Victor Kyriazakos
3ead2bdd0d feat(prompt): configurable per-platform system-prompt hint overrides
Add platform_hints config so an admin can append to or replace Hermes'
built-in platform hint for a single messaging platform (WhatsApp, Slack,
Telegram, ...) without affecting other platforms. Enables enterprise
managed profiles to steer platform-aware skills (e.g. invoke a custom
table-formatting skill on WhatsApp where Markdown tables don't render)
while leaving Telegram/Slack/CLI behavior unchanged.

- hermes_cli/config.py: document platform_hints in DEFAULT_CONFIG
- agent/agent_init.py: load platform_hints -> agent._platform_hint_overrides
- agent/system_prompt.py: _resolve_platform_hint() applies append/replace
  (replace wins; bare string = append shorthand); defensive on bad config
- tests: 16 cases covering append/replace/shorthand/isolation/malformed

Override only affects the platform-hint segment of the system prompt;
SOUL/context/memory tiers and general instructions are unchanged.
2026-06-18 14:28:01 -07:00
brooklyn!
2944b3c394
fix(desktop): make session delete idempotent and id-resolving (#48641)
DELETE /api/sessions/{id} was the only session endpoint that didn't
resolve the id (detail, messages, rename, export all call
resolve_session_id) and 404'd when the row was already gone. The desktop
optimistically removes the sidebar row, then RESTORES it and shows the
error on any failure — so deleting a session that had just been reaped
(empty-session hygiene) or removed by a concurrent client resurrected a
ghost row and surfaced "session not found". /goal + auto-compression churn
leaves transient empty rows that race the sidebar snapshot, which is the
exact "I deleted the empty one and got 'session not found'" report.

Resolve exact ids / unique prefixes, and treat an already-absent session
as an idempotent success — DELETE's contract is "ensure it's gone". This
mirrors the bulk-delete endpoint, which already treats ghost ids as
success.

Tests: deleting an absent id is idempotent (200, not 404); delete resolves
a unique prefix; a real session still deletes.
2026-06-18 21:16:06 +00:00
flooryyyy
f8d8f045fa feat(kanban): auto-subscribe calling session on kanban_create
When a worker calls kanban_create from inside a session that has a
persistent delivery channel, the originating session is now subscribed
to the new task's completion/block events automatically. The agent
that dispatched the task gets notified instead of having to poll.

- Gateway sessions (telegram/discord/slack): HERMES_SESSION_PLATFORM +
  HERMES_SESSION_CHAT_ID ContextVars, set by the messaging gateway.
- TUI / desktop sessions: HERMES_SESSION_KEY in the subprocess env.
  The TUI notification poller keys on platform='tui' + chat_id=<key>.
- CLI / cron / test: no persistent channel, no subscription.

Gated by kanban.auto_subscribe_on_create in config.yaml (default True).
Disable to mirror pre-feature behaviour — users who want explicit
kanban_notify-subscribe calls per task can set it to false. This
config gate addresses the design concern that got PR #19718 reverted
upstream (unconditional implicit auto-subscribe on tool-driven
kanban_create was too aggressive for orchestrator users).

HERMES_SESSION_ID is intentionally not a fallback channel — it is
set by ACP/agent subprocess telemetry for every invocation, not just
TUI, so treating it as a notification target would auto-subscribe
every CLI session and re-introduce the over-eager behaviour.

The kanban_create response now includes a 'subscribed' bool so
orchestrators can react if subscription failed (e.g. by falling
back to explicit kanban_notify-subscribe or to polling).

Includes 6 tests covering the gateway / TUI / CLI / partial-context /
gated / add_notify_sub-failure paths. All 90 tests in
test_kanban_tools.py pass; 509 broader kanban tests pass.
2026-06-18 14:10:51 -07:00
brooklyn!
1ea2b27993
Merge pull request #48633 from NousResearch/fix/resume-follows-compression-tip
fix(gateway): resume follows the compression tip so post-compression replies render
2026-06-18 16:09:35 -05:00
Brooklyn Nicholson
c23c370b8b test: narrow db._conn before raw SQL so ty stops flagging None-union access
The new compression-tip tests poke started_at/ended_at directly via
db._conn to force deterministic lineage ordering. _conn is typed
Optional[Connection], so ty flagged .execute/.commit as unresolved on
None. Bind a local and assert it's non-None first to narrow the union.
2026-06-18 16:04:58 -05:00
Brooklyn Nicholson
49596b70cb fix(gateway): resume follows the compression tip so post-compression replies render
Auto-compression ends the live session and forks a continuation child
(linked via parent_session_id). A long-lived parent keeps its own flushed
message rows, so resolve_resume_session_id()'s empty-head walk never
redirected it — resuming the parent id reloaded the pre-compression
transcript and dropped every turn generated after compression, including
the assistant's response. On the desktop this is the recurring "I sent a
message, came back, and the reply isn't there" report on large sessions:
the chat's routed id is the pre-rotation id, and both the gateway
session.resume RPC and the REST /messages read anchored on it.

Fix the resolver at the chokepoint: resolve_resume_session_id() now
follows the compression-continuation chain forward via get_compression_tip()
before its existing empty-head descendant walk. get_compression_tip() only
follows children whose parent ended with end_reason='compression' (created
after the parent was ended), so delegation/branch children never hijack a
resume. This fixes every resume caller at once (REST /messages, CLI
--resume, gateway /resume).

session.resume in tui_gateway was the one resume path that never called the
resolver — it used the raw target id directly. Route it through
resolve_resume_session_id() too (non-lazy only; lazy watch windows must
stay on their exact child branch). Resolving up front also re-anchors the
live-session fast path so a still-live rotated session is reused by its new
key instead of rebuilding a duplicate agent on the stale parent.

Tests:
- resolve_resume_session_id follows the tip even when the parent retains
  messages, and is not confused by a delegation child.
- session.resume binds the agent to the continuation tip and returns the
  post-compression reply.
2026-06-18 15:56:43 -05:00
teknium1
3042045540 fix(picker): keep max_models=0 distinct from unlimited; lock cap semantics
Follow-up to the cap-removal salvage. The contributor guarded the new
unlimited default with `[:max_models] if max_models else ...`, which conflates
max_models=0 (used by slug-only callers that want an empty model list) with
None (unlimited). Tighten to `is not None` at all five slicing sites in
list_authenticated_providers / list_picker_providers, and add a regression test
asserting the three-way contract: None=full, 0=empty, N=first N.
2026-06-18 13:47:31 -07:00
islam666
9705e7944a fix(picker): remove max_models=50 cap in interactive model pickers
The interactive model pickers (Desktop REST API, TUI model.options, CLI
/model) were hard-capped at max_models=50, which truncated large provider
catalogs like Kilo Gateway (336 models) to just 50 entries. This made
most models undiscoverable via the picker search box.

Changes:
- Change build_models_payload() default from max_models=50 to None (unlimited)
- Change list_authenticated_providers() default from max_models=8 to None
- Change list_picker_providers() default from max_models=8 to None
- Fix all [:max_models] slicing to handle None as 'no limit'
- Remove max_models=50 from 5 interactive picker callers:
  * web_server.py: get_model_options (Desktop /api/model/options)
  * web_server.py: get_recommended_default_model
  * model_switch.py: prewarm_picker_cache_async
  * tui_gateway/server.py: model.options JSON-RPC
  * cli.py: HermesCLI model picker
- Telegram/Discord inline keyboard picker (gateway/slash_commands.py)
  still passes max_models=50 explicitly — unchanged behavior.

The total_models field was already in the response payload and is now
meaningful since models.length == total_models for interactive pickers.

Fixes #48279
2026-06-18 13:47:31 -07:00
alelpoan
4ed2f33994
fix(thread): allow scrolling long user messages in chat history (#48619) 2026-06-18 15:44:27 -05:00
teknium1
0879d5cc8f fix(gateway): preserve original transcript when /compress rotation is skipped
The manual /compress handler called rewrite_transcript() unconditionally on
the session id returned by _compress_context(). When rotation does not occur
(e.g. _session_db unavailable, or the DB split raised), session_id is unchanged
and rewrite_transcript() DELETEs the original messages and replaces them with
only the compressed summary — permanent data loss (#44794, #39704).

Guard the rewrite on actual rotation: only overwrite when _compress_context
produced a new session id. Otherwise leave the original transcript intact and
log a warning.
2026-06-18 13:38:35 -07:00
kyssta-exe
81ff916e57 fix(agent): flush un-persisted messages before session rotation (#47202)
compress_context() rotates the session (end_session -> create_session)
mid-turn when auto-compress triggers, but never called
_flush_messages_to_session_db() first. Messages generated during the
current turn that hadn't been persisted to state.db were silently lost.

The same bug existed in cli.py:new_session() (/new command). Both paths
now flush un-persisted messages before ending the old session.
2026-06-18 13:38:35 -07:00
Siddharth Balyan
73cd8622f9
feat(billing): /billing terminal billing — interactive TUI + CLI client (#45449)
* feat(billing): nous_billing http client + BillingState core (phase 2b)

Phase 2b terminal-billing client foundation:
- hermes_cli/nous_billing.py: typed client for the 4 /api/billing/* endpoints
  (state/charge/poll/auto-top-up). Raises typed errors (BillingScopeRequired,
  BillingRateLimited, BillingAuthError) mapped from the live-verified contract;
  fail-open is the caller's job. Idempotency-Key enforced client-side.
- agent/billing_view.py: surface-agnostic BillingState core + Decimal money
  parsing (server emits decimal strings, not 2dp), fail-open builder,
  idempotency-key gen, custom-amount validation.
- 51 unit tests (decimal parse/format, payload tiering, error->exception
  matrix, fail-open, amount validation).

Plan: docs/plans/2026-06-13-001-phase-2b-terminal-billing-tui-plan.md

* feat(billing): billing:manage scope + lazy step-up re-auth (phase 2b)

- NOUS_BILLING_MANAGE_SCOPE constant.
- nous_token_has_billing_scope(): split-based scope check (no false-positive
  substring match).
- step_up_nous_billing_scope(): re-runs the device flow requesting
  billing:manage, reusing the held credential's portal/inference URLs + client_id
  (so a preview stays a preview), persists like _login_nous but WITHOUT the model
  picker. Returns True iff the minted token carries the scope (False when NAS
  silently downscopes a non-admin / unticked grant).

Lazy step-up (plan D-A): normal login path unchanged; 403 insufficient_scope
from a billing call triggers this. 7 unit tests.

* feat(billing): billing JSON-RPC methods for the TUI (phase 2b)

billing.state / charge / charge_status / auto_reload / step_up in
tui_gateway/server.py. Return STRUCTURED success envelopes (result.ok +
result.error=<code>) rather than JSON-RPC-level errors, so the Ink rpc() promise
always resolves and the TUI branches on the typed billing error code
(insufficient_scope, rate_limited, no_payment_method, …) to render the right
affordance. Money serialized as decimal STRINGS + display strings. charge mints
+ echoes an idempotency_key for retry reuse. 16 unit tests.

* feat(billing): /billing CLI handler + command registry (phase 2b)

- CommandDef("billing", subcommands=buy|auto-reload|limit), added to
  _SLACK_VIA_HERMES_ONLY so it routes via /hermes on Slack (keeps the 50-cap
  parity test green, same as /credits).
- cli.py::_show_billing + screen helpers: all 5 screens (overview, buy→confirm→
  poll, auto-reload, monthly-limit read-only). Reuses _prompt_text_input_modal /
  _prompt_text_input (D-C). Non-interactive (_app is None) renders text + portal
  deep-link, never prompts (R7). Decimal money end-to-end. 2s/5-min cancellable
  poll loop; 429/503 = retry not failure; settled = ledger truth. Lazy step-up on
  403 insufficient_scope. no_payment_method treated as mainline funnel-to-portal.
- 6 CLI tests; 156 command tests (incl. Slack/Telegram parity) green.

* feat(billing): /billing Ink TUI screens + tests (phase 2b)

- ui-tui/src/app/slash/commands/billing.ts: /billing TUI command covering all 5
  screens — overview (text), buy <amt> → ConfirmReq → charge → non-blocking 2s/
  5-min poll loop → settled/failed/timeout branches, auto-reload <below> <to> →
  ConfirmReq → PATCH, limit (read-only). Reuses the existing ConfirmReq overlay
  (D-C) — no bespoke component. Typed-error envelope branching: insufficient_scope
  arms the lazy step-up confirm; no_payment_method/rate_limited/cap funnel to
  portal. Client-side amount validation mirrors the server (bounds + 2dp).
- gatewayTypes.ts: Billing* response interfaces.
- registry.ts: register billingCommands.
- billingCommand.test.ts: 12 vitest cases (overview/gating/buy-confirm-poll-
  settled/no_payment_method/step-up/limit/auto-reload/validation).

TUI build green; 12/12 vitest pass; slash tests pass once @hermes/ink is built.

* docs(billing): scrub private cross-repo references

NAS is a private repo — remove all references to it from the public PR:
- drop the cross-repo planning doc (planning scaffolding, not a deliverable;
  the PR description documents the design)
- replace 'NAS' / 'PR #412 preview' mentions in code + test comments with
  generic 'the server' / 'a preview deployment'

* docs(billing): scrub final NAS reference in step-up docstring

* docs(billing): drop dangling plan-doc refs

The phase-2b plan doc was removed in the cross-repo scrub (300afcc0b)
but two module docstrings still pointed at it. Drop the dead refs.

* feat(billing): interactive /billing overlay + step-up UX, portal-URL & token fixes

Adds the interactive /billing TUI overlay and hardens the terminal-billing
client across CLI and TUI.

- TUI: full /billing overlay state machine (overview to buy to confirm,
  auto-reload, read-only monthly limit) reusing the existing confirm overlay.
- Step-up: surface the verification link in-transcript and open the browser
  via the TUI's own opener (the device flow runs in the headless gateway, so a
  printed URL was being dropped); run the step-up handler off the main loop and
  emit the link as an out-of-band event so the gateway stays responsive.
- Step-up copy is scope-accurate ("Billing permission granted") and re-checks
  /state so it never claims "enabled" when the org kill-switch is still off.
- Portal deep-links resolve to absolute URLs against the active portal base
  (the server emits them relative) - fixes a bare "/billing?topup=open" link.
- Billing calls refresh an expired access token via the stored refresh token
  instead of reporting a false "not logged in".
- Optimistic funnel: advise "set up a saved card on the portal" up front when
  no card is on file (advisory, not a hard gate).
- Token resolution is cached briefly so the 2s charge poll loop stops
  re-locking + re-reading the auth store on every tick; 401 re-resolves fresh.
- Remove the temporary demo-mode shims.

Validation: 87 Python billing tests, 88 TS tests (billing command + gateway
event handler), tsc clean, ink + ui-tui builds green.

* docs(billing): add /billing TUI screenshots for PR

* fix(cli): guard _last_invalidate on bare instances; update stale prompt-fallback test

The UI-invalidate throttle read self._last_invalidate unconditionally, which
raised AttributeError on HermesCLI instances built without __init__ (the
thread-safety test's object.__new__ shell). Guard the read with getattr.

The off-main-thread branch of _prompt_text_input was changed (#23185) to cancel
cleanly to None instead of falling back to a bare input() that would hang on the
slash-worker thread; the test still asserted the old direct-input fallback.
Update it to assert the current intended behavior: returns None, calls neither
run_in_terminal nor input(), and does not hang.
2026-06-19 01:53:32 +05:30
brooklyn!
81eaedd0f5
Merge pull request #48533 from NousResearch/hermes/hermes-4061c6a8
fix(prompt,desktop,tui): dedupe parallel-tool-call steer + surface self-improvement review summary
2026-06-18 13:27:07 -05:00
Brooklyn Nicholson
51ee5b2c94 fix(desktop,tui): surface self-improvement review summary + honor memory_notifications
The "💾 Self-improvement review" summary (skill/memory updated) was invisible
on two surfaces:

- Desktop Electron app had no review.summary event handler — skill/memory
  writes happened silently. Now appends a persistent system message to the
  transcript (matching the Ink TUI's persistent-line semantics, not a
  transient toast that can be missed).
- tui_gateway (backs both 'hermes --tui' and the desktop) never read
  display.memory_notifications, so it always behaved as 'on' and ignored a
  user who set 'off'/'verbose'. Added _load_memory_notifications() (mirrors
  the messaging gateway's bool->str normalization, defaults to 'on') and
  wired it to agent.memory_notifications, matching gateway/run.py and the CLI.

Delivery chain now reaches all surfaces:
background_review.py -> background_review_callback -> review.summary event ->
desktop transcript / Ink TUI line / gateway message / CLI print.
2026-06-18 13:22:12 -05:00
Brooklyn Nicholson
07e785d60a fix(prompt): dedupe parallel-tool-call steer; correct its rationale
The universal PARALLEL_TOOL_CALL_GUIDANCE block already lives on main, but it
shipped with two rough edges this change cleans up:

- It duplicated the batching steer for Google models. The
  GOOGLE_MODEL_OPERATIONAL_GUIDANCE block still carried its own
  "Parallel tool calls" bullet, so Gemini/Gemma received the instruction
  twice in one prompt. Drop the redundant bullet — the universal block is now
  the single source.
- Its comment claimed "nothing in the open-source system prompt encouraged
  batching," which was wrong: the steer existed for Google models only. Reword
  to say the gap was that every *other* model got nothing.
- Tighten the test that asserts the steer (precedence-correct), and add an
  invariant guarding against re-introducing the Google duplicate.
2026-06-18 13:22:12 -05:00
Teknium
0fa7d6f660
fix(desktop): never persist or restore a named custom provider as bare "custom" (#48547)
* Port from cline/cline#11514: encourage parallel tool calls

Add a universal system-prompt guidance block telling the model to batch
independent tool calls (reads, searches, web fetches, read-only commands)
into a single assistant turn instead of one call per turn. The runtime
already executes independent batches concurrently (read-only tools always;
non-overlapping path-scoped file ops); the open-source system prompt had
nothing steering the model to PRODUCE the batch. Fewer round-trips means
less resent context, which compounds over a long conversation.

- prompt_builder.py: new PARALLEL_TOOL_CALL_GUIDANCE block (short, static,
  cache-amortised) modeled on TASK_COMPLETION_GUIDANCE.
- system_prompt.py: inject right after the task-completion block, gated by
  agent.valid_tool_names + the new toggle.
- agent_init.py: read agent.parallel_tool_call_guidance (default True).
- config.py: add the default under the agent section.
- test_prompt_builder.py: behavior-contract tests (batching steer, dependent
  carve-out, length bound) — invariants, not wording snapshots.

Adapted from Cline's TypeScript tool-surface guidance to hermes-agent's
Python prompt-assembly architecture and config-over-env conventions.

* fix(desktop): never persist or restore a named custom provider as bare "custom"

Custom providers vanish from the Desktop/TUI model picker with
"No LLM provider configured" — repeatedly fixed (#44062, #44109, #45578)
and repeatedly regressed (#44022, #47714) because every fix only recovered
the entry identity from a persisted base_url. When a session is
persisted/restored with the resolved provider "custom" and NO base_url, bare
"custom" leaked through verbatim; resolve_runtime_provider("custom") routes to
the OpenRouter default URL with no api_key, so the next turn/resume dies.

Bare "custom" is the resolved billing class shared by every named providers:/
custom_providers: entry — it is not a routable identity. Centralize the
"never let bare custom escape" invariant in one helper,
runtime_provider.canonical_custom_identity(), and apply it at all four leak
sites in tui_gateway/server.py:

- _ensure_session_db_row  — the ORIGIN: first DB write seeds the bad row
- _runtime_model_config   — live persist
- _stored_session_runtime_overrides — resume restore (heals old rows; drops
  unrecoverable bare custom so resume falls back to config default)
- _make_agent             — rebuild / per-turn

The helper recovers custom:<name> from the endpoint URL when present, else
from config.model.provider (the durable identity left when no base_url
survived). Regression tests in test_custom_provider_session_persistence.py
lock the no-base_url vector at every site so it cannot regress again.
2026-06-18 11:11:51 -07:00
Teknium
38c8a9c10f
feat(memory): batch operations for single-turn memory updates (#48507)
The memory tool was strictly one-op-per-call. With the store running near
its char limit by design, a new add that would overflow gets rejected with
'consolidate now, then retry' -- but the model could not consolidate and add
in one call. It had to remove/replace across several turns, then retry the
add, each turn re-sending the whole conversation context. Expensive thrash.

Add an 'operations' array: a list of add/replace/remove ops applied
atomically against the FINAL char budget. The model frees space and adds new
entries in ONE call, even when an add alone would overflow. All-or-nothing:
any bad op aborts the whole batch, nothing written.

Root-cause note: the two agent-level memory interception sites
(agent_runtime_helpers.py, tool_executor.py) silently dropped any param not
in their explicit kwarg list, so 'operations' never reached the handler and
batch calls failed with 'Unknown action None'. Both now pass it through and
bridge each add/replace op to external memory providers.

Also: success response is now terminal (done=true + 'do not repeat' note,
no full-entries echo that invited re-edits); schema rewritten to lead with
the batch mechanism and an explicit one-shot stop rule (2138 -> 1476 chars).

Live-verified: near-full consolidate-and-add went 7 calls -> 1 call,
stable across 3 reps. 103 memory/approval tests + 398 background-review/
run_agent tests green; 6 new batch tests added.
2026-06-18 10:19:33 -07:00
kshitij
2fa16ec2d2
Merge pull request #48529 from kshitijk4poor/salvage-48372-eap
fix(install): relax EAP=Stop around native git/uv calls + fail-fast on uv venv failure (#48352, salvage of #48372)
2026-06-18 22:17:53 +05:30
kshitijk4poor
fd12e59e6b fix(install): fail fast when uv venv genuinely fails under relaxed EAP
PR #48372 relaxes EAP=Stop around the uv venv call so PowerShell 5.1
doesn't mistake uv's 'Using CPython ...' stderr for a terminating
NativeCommandError. But relaxing EAP also means a *genuine* uv venv
failure (exit != 0) no longer aborts on its own — Install-Venv would
continue and print 'Virtual environment ready', and in stage mode
Invoke-Stage would report ok=true, even though no venv was created.

Capture $LASTEXITCODE immediately after the relaxed call and throw on
non-zero (Pop-Location first, matching the function's other exit paths),
so the venv stage fails fast instead of falsely succeeding. This is the
explicit guard originally proposed in #48463 (devorun), composed on top
of #48372's reusable helper + regression test.

Adds a regression test asserting the uv venv exit-code capture + throw.
2026-06-18 22:11:35 +05:30
Teknium
c37fdec2d9
feat(dashboard): surface full per-MCP catalog detail; fix pip-install doc (#48520)
The dashboard MCP catalog only showed name/description/transport and a
non-clickable source. Users couldn't see what an entry connects to or runs
before installing — the exact detail the docs trust model tells them to vet.

- /api/mcp/catalog now returns transport target (url, or command+args),
  auth_type, git install source/ref + bootstrap commands, default-enabled
  tool hint, and post-install guidance per entry.
- McpPage renders the endpoint URL (http) or command+args (stdio), the git
  install source/ref, a collapsible bootstrap-commands list, setup notes,
  and the source as a clickable link when it's a URL.
- Docs: drop the 'uv pip install -e .[mcp]' quick-start step (Hermes does
  not support pip installs; MCP ships with the standard install) and note
  the dashboard now surfaces this detail.
- Strengthen the catalog endpoint test to assert the new inspection fields.
2026-06-18 09:40:56 -07:00
kshitij
4af16b5da2
Merge pull request #48206 from ehz0ah/fix/openviking-current-api-rebased
fix(openviking): adapt memory provider for current api
2026-06-18 21:53:42 +05:30
teknium1
5ffbfed193 feat(mcp-catalog): add official Unreal Engine 5.8 MCP server
Epic's experimental Unreal MCP plugin embeds an MCP server inside the
Unreal Editor process, served over local HTTP (127.0.0.1:8000/mcp by
default). HTTP transport, no auth, no install block — the user enables
the plugin in-editor and Hermes connects to the URL.

Also drops test_optional_mcps_manifests_ship_in_both_wheel_and_sdist:
it asserted wheel/sdist packaging targets for pip/Homebrew/Nix installs,
which Hermes does not support — installs run from the repo checkout, where
the catalog is discovered by directory iteration with no packaging step.
2026-06-18 09:16:40 -07:00
xxxigm
58ad6942d9
fix(tui): don't make Enter swallow trailing-space-only slash completions (#48425)
* fix(tui): don't make Enter swallow trailing-space-only slash completions

Submitting a slash command in the TUI took three Enter presses: one to
complete the name (/ex → /exit), a second that only appended the trailing
space the gateway adds to keep the classic-CLI prompt_toolkit dropdown open
(/exit → "/exit "), and a third to actually submit.

The composer's submit handler accepted the highlighted completion whenever
applying it changed the input at all, so the whitespace-only delta ate an
extra keypress. Treat a completion whose only change is trailing whitespace
on an already-complete token as "already complete" and fall through to
submit. Partial-name and argument completions (a real token change) still
accept on Enter as before.

The replace/accept logic is extracted into pure helpers (applyCompletion,
completionToApplyOnSubmit) in domain/slash.ts.

* test(tui): cover Enter/completion trailing-space behavior and isolate poller queue

- completionApply.test.ts asserts completionToApplyOnSubmit accepts real
  token completions (partial command name, argument) but returns null for a
  trailing-space-only delta on an already-complete command, so Enter submits
  instead of needing extra presses.
- test_notification_poller_delivers_completion / _skips_consumed previously
  shared the process-global process_registry.completion_queue. Their events
  carry no session_key, so a leaked/concurrent poller could dequeue and
  dispatch them to a fixture agent without run_conversation, flaking CI
  ("AttributeError: '_FakeAgent' object has no attribute 'run_conversation'").
  Isolate the queue per test (fresh queue.Queue via monkeypatch), matching the
  sibling poller tests that already do this.
2026-06-18 11:04:59 -05:00
Teknium
25c590ccd0 fix(skills): refuse SKILLS_DIR root in rmtree guard, not just outside-tree
The salvaged guard allowed _rmtree_writable(SKILLS_DIR) itself. No call
site ever passes the root — every site passes a skill subdir or its .bak
sibling — so allowing the root only preserves the #48200 footgun (a dest
that collapses to the root wipes every installed skill). Require a strict
strict-child relationship and update the test that documented the
nonexistent 'full reset' capability.
2026-06-18 08:53:35 -07:00
Kewe63
f1254c8eaf fix(skills): rmtree scope guard + default pre_update_backup to true (#48200)
Defense-in-depth fix for the silent wipe of ~/.hermes/ documented in
#48200. A `hermes update --yes` run silently destroyed a user's
.env, MEMORY.md, kanban.db, custom skills, and scripts. Two changes:

1. `_rmtree_writable` in tools/skills_sync.py now refuses to rmtree
   anything outside SKILLS_DIR (the HERMES_HOME/skills/ root).
   All five call sites pass paths under SKILLS_DIR, so the guard is
   a no-op for current code and a loud, recoverable failure for
   any future regression (bad path join, malicious bundled
   manifest, stale path in scope after an exception).

2. The default `updates.pre_update_backup` flips from false to
   true in hermes_cli/config.py. A few minutes of zip per update
   is negligible compared to silent total data loss. Still
   overridable; --no-backup still works for one-off opt-out.

Five new tests in TestRmtreeWritableScopeGuard (root path,
hermes home, sibling dir, skills root itself, subdir) plus a
flipped `test_default_enabled_creates_backup` in test_backup.py.
178/178 tests pass in the two affected files. Public method
signatures unchanged, no test-stub blast radius.

Closes #48200
2026-06-18 08:53:35 -07:00
Teknium
41babc702e chore(release): map iamlukethedev to AUTHOR_MAP 2026-06-18 08:53:31 -07:00
Luke The Dev
3c3ac19d9c fix(#37878): Address review feedback — fix trailing whitespace and add ANTHROPIC_API_KEY test
Review feedback from egilewski:
1. Remove trailing whitespace from test docstring and mock patches (lines 1430, 1469, 1476, 1482)
2. Expand test coverage: also verify ANTHROPIC_API_KEY is stripped (not just OPENAI_API_KEY)

Changes:
- Remove trailing whitespace from test file
- Add ANTHROPIC_API_KEY to test environment
- Add assertion verifying ANTHROPIC_API_KEY is stripped from cua-driver subprocess env
- Syntax verified: python3 -m py_compile tests/tools/test_computer_use.py ✓
2026-06-18 08:53:31 -07:00
Luke The Dev
2e5c04aaf7 fix(#37878): scrub operator environment before launching cua-driver MCP
- Use _sanitize_subprocess_env() to filter Hermes-managed credentials
  from the cua-driver subprocess environment (issue #37878)
- Prevents credential exfiltration to the third-party cua-driver binary
- Aligns with existing pattern used by browser-tool and other tools
- Add regression test to verify environment sanitization

The cua-driver is a lower-trust MCP subprocess per SECURITY.md §2.3.
Its inherited environment is now scrubbed by default, removing provider
API keys, gateway tokens, and platform credentials that should not leak
to third-party binaries.

Fixes #37878
2026-06-18 08:53:31 -07:00
kshitij
b39ec2fc37
Merge pull request #48341 from xxxigm/fix/install-ps1-powershell-host-resolution
fix(install): resolve PowerShell host instead of bare `powershell` for uv install
2026-06-18 21:09:50 +05:30
Siddharth Balyan
646cd1b43e
fix(nix): refresh npmDepsHash after the Electron 40.10.2 pin (#47792) (#48457)
PR #47792 pinned Electron to an exact 40.10.2 and regenerated the root
package-lock.json (dropping @electron/get@5 + @electron-internal/extract-zip,
restoring @electron/get@2 + extract-zip@2 + yauzl), but did not refresh the
shared npmDepsHash in nix/lib.nix. The hash still described the previous
40.10.3 lockfile, so npmConfigHook fails on every Nix build with
"npmDepsHash is out of date" for hermes-tui / hermes-web / hermes-desktop.

Regenerate the single shared hash to match the current lockfile.

Verified with fetchNpmDeps (authoritative, not prefetch-npm-deps):
  nix build .#tui.npmDeps  -> builds clean
  nix build .#tui          -> Validating consistency -> Installing dependencies
                              -> Finished npmConfigHook (no hash error)
2026-06-18 15:00:08 +00:00
teknium1
ef4b897a18 chore(release): map srojk34 author email 2026-06-18 05:55:17 -07:00
srojk34
92e6d8c858 fix(desktop): dispose open PTY sessions in before-quit handler
The `before-quit` handler tears down the bootstrap controller, preview
watchers, and the Python backend but never disposes live PTY sessions.
When `app.quit()` proceeds to `FreeEnvironment()`, node-pty's
`ThreadSafeFunction::CallJS` callback fires on a half-torn-down
environment, throws a C++ exception that can no longer be caught, and
the process aborts (microsoft/node-pty#904).

Iterate `terminalSessions` and call `disposeTerminalSession()` (which
already calls `pty.kill()` + deletes the map entry) before killing the
backend, so the ThreadSafeFunctions are removed before teardown begins.

Closes #48335
2026-06-18 05:55:17 -07:00
Teknium
2f7c4858a7
fix(tui): refresh tool snapshot when MCP discovery lands after agent build (#48403)
The TUI banner reported fewer tools than the classic CLI for the same
config (e.g. 32 vs 38) when an MCP server connected slowly. Root cause:
the agent snapshots `agent.tools` once at build time and never re-reads
the registry. `_make_agent` briefly joins the background MCP discovery
thread (`wait_for_mcp_discovery`, ~0.75s) so fast servers land in that
snapshot, but a server slower than the bound — common for an HTTP MCP
server on first connect — lands *after* the agent is built. Its tools are
then absent from both the agent (uncallable until `/reload-mcp`) and the
banner for the whole session.

The classic CLI doesn't hit this because it re-derives
`get_tool_definitions()` at banner render time (which re-waits for
discovery), so it picks the late tools up.

Fix: after a fresh agent is built and its first `session.info` emitted,
if discovery is still in flight, schedule an off-critical-path daemon that
waits for it to finish, then rebuilds the tool snapshot and re-emits
`session.info` — the same rebuild `/reload-mcp` performs, but automatic.
Both the agent's callable tools and the banner count catch up.

Cache safety: the rebuild runs only while the session is still
pre-first-turn (`_user_turn_count`/`_api_call_count` both 0 → nothing
cached to invalidate). Once the user has sent a message we leave the
snapshot frozen rather than break the cached prompt prefix mid-conversation;
late tools then require an explicit `/reload-mcp` (user-consented), exactly
as today. No-op when discovery finished before the agent build, when the
join times out, when the registry was unchanged, or when the session was
swapped/closed while waiting.

Adds entry.mcp_discovery_in_flight() / join_mcp_discovery() accessors and
covers the matrix (added/none/post-turn/timeout/unchanged/replaced) with
unit tests.
2026-06-18 05:41:23 -07:00
Teknium
8abdab24c9
fix(tui): MCP headline counts connected servers, not disabled ones (#48402)
The TUI banner footer used the raw `info.mcp_servers.length`, so a
configured-but-disabled server (e.g. `linear`) was counted alongside
connected ones. With a disabled `linear` and a connected `nous-support`,
the TUI reported "2 MCP" while the classic CLI correctly reported "1 MCP"
(`mcp_connected = sum(1 for s in mcp_status if s["connected"])` in
hermes_cli/banner.py).

The collapse toggle even labels the count "connected", which was wrong
for the same reason.

Count connected servers for both the toggle and the footer segment, and
drop the `· N MCP` segment entirely when none are connected (matching the
classic banner, which only appends it when the count is > 0). The
expandable MCP section still lists every configured server, including
disabled ones.

Invariant test renders SessionPanel and asserts the headline equals the
connected count, never the configured total.
2026-06-18 05:41:19 -07:00
Tranquil-Flow
67316fdc94 fix(install): relax native stderr handling in install.ps1 (#48352) 2026-06-18 12:06:29 +02:00
xxxigm
feff283e17 test(install): lock uv installer to a resolved PowerShell host
Source-level guard (install.ps1 only runs on Windows, so there's no Linux CI
runner to execute it): the astral uv install line must be invoked via the call
operator on a resolved host variable, the bare-`powershell` literal that
produced the field-reported "The term 'powershell' is not recognized" must be
gone, and the resolver must be PATH-independent (Get-Process -Id $PID) and
pwsh-aware.
2026-06-18 16:26:34 +07:00
xxxigm
a14bae6bcc fix(install): resolve PowerShell host instead of bare powershell for uv
The Windows installer's Install-Uv spawned the astral uv installer with a
hardcoded bare `powershell -ExecutionPolicy ByPass -c "irm .../uv | iex"`.
That name resolves only to Windows PowerShell, and only when its System32
directory is on PATH. Run under PowerShell 7+ (`pwsh`) — or any session where
`powershell` isn't on PATH — the spawn dies with "The term 'powershell' is not
recognized", and uv installation aborts (the installer then appears stuck).

Add Get-PowerShellHostExe, which prefers the absolute path of the host we're
already running in (PATH-independent), then falls back to powershell/pwsh via
Get-Command, then to the bare name. Install-Uv now invokes that resolved exe.
2026-06-18 16:26:34 +07:00
qin-ctx
2a5d51c16e fix(openviking): adapt memory provider for current api
(cherry picked from commit cbb87389f3)
2026-06-18 16:58:11 +08:00
kshitij
426f321e84
Merge pull request #48299 from NousResearch/chore/author-map-infinitycrew39
Some checks failed
Deploy Site / deploy-vercel (push) Waiting to run
Deploy Site / deploy-docs (push) Waiting to run
Docker Build and Publish / build-amd64 (push) Waiting to run
Docker Build and Publish / build-arm64 (push) Waiting to run
Docker Build and Publish / merge (push) Blocked by required conditions
Lint (ruff + ty) / ruff + ty diff (push) Waiting to run
Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run
Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run
OSV-Scanner / Scan lockfiles (push) Waiting to run
Tests / test (1) (push) Waiting to run
Tests / test (2) (push) Waiting to run
Tests / test (3) (push) Waiting to run
Tests / test (4) (push) Waiting to run
Tests / test (5) (push) Waiting to run
Tests / test (6) (push) Waiting to run
Tests / save-durations (push) Blocked by required conditions
Tests / e2e (push) Waiting to run
Typecheck / typecheck (apps/bootstrap-installer) (push) Waiting to run
Typecheck / typecheck (apps/desktop) (push) Waiting to run
Typecheck / typecheck (apps/shared) (push) Waiting to run
Typecheck / typecheck (ui-tui) (push) Waiting to run
Typecheck / typecheck (web) (push) Waiting to run
Typecheck / desktop-build (push) Waiting to run
uv.lock check / uv lock --check (push) Waiting to run
Docker / shell lint / Lint Dockerfile (hadolint) (push) Has been cancelled
Docker / shell lint / Lint docker/ shell scripts (shellcheck) (push) Has been cancelled
chore(release): map infinitycrew39 author email
2026-06-18 13:09:59 +05:30
kshitijk4poor
ca28c630c7 chore(release): map infinitycrew39 author email
Add infinitycrew39@gmail.com -> infinitycrew39 to AUTHOR_MAP so the
contributor audit resolves the two cherry-picked commits from the #47945
langfuse trace-scope salvage (merged as #48292) to a GitHub handle instead
of flagging them as an unmapped author email.
2026-06-18 13:09:34 +05:30
kshitij
9b2f7d2cb1
Merge pull request #48292 from NousResearch/fix/langfuse-trace-scope-salvage
fix(langfuse): scope trace state by turn/request ids (salvage #47945)
2026-06-18 13:08:17 +05:30
kshitijk4poor
0787ea07c8 test(langfuse): pin exact surviving key in turn-isolation test
The prior assertion `all("turn1" in k or "turn2" in k for k in keys)` was
weak on two counts: it passes vacuously when keys is empty (a regression
that lost all state would slip through), and after turn 2 finalizes only
turn 1 lingers, so it only ever inspected turn 1 anyway. Replace it with an
exact check that one key survives, it is turn 1, and turn 2 never merged
into it — the real isolation invariant the test name claims.
2026-06-18 13:00:01 +05:30
kshitijk4poor
f4fbaa6cda fix(langfuse): bound _TRACE_STATE growth from non-finalizing turns
Scoping the trace key by turn_id (the prior commit) fixed cross-turn
collisions but introduced a slow leak: _finish_trace only pops a key when a
turn ends cleanly (final response has content and no tool calls), so any
turn that is interrupted, ends on a tool call, or has empty final content
now leaves its uniquely-keyed entry in _TRACE_STATE forever. Previously the
constant per-session key was overwritten by the next turn, capping growth at
~1 entry per session.

Add an LRU cap (_MAX_TRACE_STATE) enforced by _evict_stale_locked, called
under _STATE_LOCK immediately before each insert. It evicts the
least-recently-updated entries (using the previously-dead last_updated_at
field) and ends their root span so nothing dangles. Regression test drives
50 non-finalizing turns against a cap of 8 and asserts the dict stays bounded
with the most-recent turns surviving.
2026-06-18 12:59:41 +05:30
kshitijk4poor
e1d10ec1ed refactor(langfuse): extract _scope_prefix from _trace_key
The turn- and api-scoped branches each repeated the same
task/session/thread fallback ladder with only the infix differing. Extract
the shared prefix into _scope_prefix so a future scope dimension touches one
ladder instead of three. The legacy branch still returns a bare task_id (not
the task: prefix) for backward compatibility, so it stays separate.

Output key strings are unchanged; a new test pins them across every
task/session/turn/api combination since the keys are matched across hooks
and any drift would silently break trace finalization.
2026-06-18 12:58:24 +05:30
kshitij
860cf5133a
Merge pull request #48293 from kshitijk4poor/chore/skills-diff-cleanup
refactor(skills): dedupe file-listing + share user-modified predicate (follow-up to #48286)
2026-06-18 12:49:53 +05:30
kshitijk4poor
f6fac60e66 refactor(skills): dedupe file-listing, share user-modified predicate, trim diff contract
Cleanup pass on the salvage (behavior-preserving):

- diff_bundled_skill now uses the existing _skill_file_list() helper
  instead of reimplementing the rglob/is_file/relative_to file-set
  enumeration inline (twice).
- Extract _is_tracked_user_modification(origin_hash, user_hash) and use
  it in BOTH the sync loop and list_user_modified_bundled_skills() so the
  'kept user edit' rule can't drift between the two sites.
- _read_text_for_diff -> _read_for_diff returns (bytes, text); the binary
  branch now compares the bytes it already read instead of re-reading
  both files from disk.
- Drop the unused 'user_present' key from diff_bundled_skill's return
  contract (no consumer or test ever read it).
- test_update_modified_notice: drop the brittle '>= 2 sites' count-floor
  so consolidating the two print paths into a shared helper stays a
  welcome refactor; keep the per-site 'count notice => discovery hint'
  invariant (still mutation-tested).
2026-06-18 12:42:58 +05:30
kshitijk4poor
b4356135f2 test(langfuse): add end-to-end turn-isolation regression
The PR added helper-level tests for _trace_key but nothing exercised the
keys through the real hooks. This adds TestTurnTraceIsolation, which drives
on_pre_llm_request / on_post_llm_call across two turns of one gateway
session (task_id == session_id, unique turn_id, api_call_count reset per
turn) and asserts each turn opens its own root trace when the first turn
fails to finalize (tool-only final step). This test fails on the pre-fix
code (only one trace opened, turn 2 absorbed into turn 1) and passes with
the scoping fix.

Also pins the turn_id-over-api_request_id key precedence: the turn-scoped
post_llm_call carries no api_request_id, so it must still resolve to the
same key as the request-scoped hooks or finalization breaks.
2026-06-18 12:38:44 +05:30
infinitycrew39
40ed67ccfe test(langfuse): cover turn/api trace-key scoping 2026-06-18 12:36:35 +05:30
infinitycrew39
0b54a33a34 fix(langfuse): scope trace state by turn/request ids 2026-06-18 12:36:35 +05:30
kshitij
737007e335
Merge pull request #48286 from kshitijk4poor/salvage/skills-list-modified-diff
feat(skills): find & diff user-modified bundled skills (salvage of #47802)
2026-06-18 12:33:28 +05:30