hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-21 10:22:18 +00:00

Author	SHA1	Message	Date
Victor Kyriazakos	3ead2bdd0d	feat(prompt): configurable per-platform system-prompt hint overrides Add platform_hints config so an admin can append to or replace Hermes' built-in platform hint for a single messaging platform (WhatsApp, Slack, Telegram, ...) without affecting other platforms. Enables enterprise managed profiles to steer platform-aware skills (e.g. invoke a custom table-formatting skill on WhatsApp where Markdown tables don't render) while leaving Telegram/Slack/CLI behavior unchanged. - hermes_cli/config.py: document platform_hints in DEFAULT_CONFIG - agent/agent_init.py: load platform_hints -> agent._platform_hint_overrides - agent/system_prompt.py: _resolve_platform_hint() applies append/replace (replace wins; bare string = append shorthand); defensive on bad config - tests: 16 cases covering append/replace/shorthand/isolation/malformed Override only affects the platform-hint segment of the system prompt; SOUL/context/memory tiers and general instructions are unchanged.	2026-06-18 14:28:01 -07:00
brooklyn!	2944b3c394	fix(desktop): make session delete idempotent and id-resolving (#48641) DELETE /api/sessions/{id} was the only session endpoint that didn't resolve the id (detail, messages, rename, export all call resolve_session_id) and 404'd when the row was already gone. The desktop optimistically removes the sidebar row, then RESTORES it and shows the error on any failure — so deleting a session that had just been reaped (empty-session hygiene) or removed by a concurrent client resurrected a ghost row and surfaced "session not found". /goal + auto-compression churn leaves transient empty rows that race the sidebar snapshot, which is the exact "I deleted the empty one and got 'session not found'" report. Resolve exact ids / unique prefixes, and treat an already-absent session as an idempotent success — DELETE's contract is "ensure it's gone". This mirrors the bulk-delete endpoint, which already treats ghost ids as success. Tests: deleting an absent id is idempotent (200, not 404); delete resolves a unique prefix; a real session still deletes.	2026-06-18 21:16:06 +00:00
flooryyyy	f8d8f045fa	feat(kanban): auto-subscribe calling session on kanban_create When a worker calls kanban_create from inside a session that has a persistent delivery channel, the originating session is now subscribed to the new task's completion/block events automatically. The agent that dispatched the task gets notified instead of having to poll. - Gateway sessions (telegram/discord/slack): HERMES_SESSION_PLATFORM + HERMES_SESSION_CHAT_ID ContextVars, set by the messaging gateway. - TUI / desktop sessions: HERMES_SESSION_KEY in the subprocess env. The TUI notification poller keys on platform='tui' + chat_id=<key>. - CLI / cron / test: no persistent channel, no subscription. Gated by kanban.auto_subscribe_on_create in config.yaml (default True). Disable to mirror pre-feature behaviour — users who want explicit kanban_notify-subscribe calls per task can set it to false. This config gate addresses the design concern that got PR #19718 reverted upstream (unconditional implicit auto-subscribe on tool-driven kanban_create was too aggressive for orchestrator users). HERMES_SESSION_ID is intentionally not a fallback channel — it is set by ACP/agent subprocess telemetry for every invocation, not just TUI, so treating it as a notification target would auto-subscribe every CLI session and re-introduce the over-eager behaviour. The kanban_create response now includes a 'subscribed' bool so orchestrators can react if subscription failed (e.g. by falling back to explicit kanban_notify-subscribe or to polling). Includes 6 tests covering the gateway / TUI / CLI / partial-context / gated / add_notify_sub-failure paths. All 90 tests in test_kanban_tools.py pass; 509 broader kanban tests pass.	2026-06-18 14:10:51 -07:00
brooklyn!	1ea2b27993	Merge pull request #48633 from NousResearch/fix/resume-follows-compression-tip fix(gateway): resume follows the compression tip so post-compression replies render	2026-06-18 16:09:35 -05:00
Brooklyn Nicholson	c23c370b8b	test: narrow db._conn before raw SQL so ty stops flagging None-union access The new compression-tip tests poke started_at/ended_at directly via db._conn to force deterministic lineage ordering. _conn is typed Optional[Connection], so ty flagged .execute/.commit as unresolved on None. Bind a local and assert it's non-None first to narrow the union.	2026-06-18 16:04:58 -05:00
Brooklyn Nicholson	49596b70cb	fix(gateway): resume follows the compression tip so post-compression replies render Auto-compression ends the live session and forks a continuation child (linked via parent_session_id). A long-lived parent keeps its own flushed message rows, so resolve_resume_session_id()'s empty-head walk never redirected it — resuming the parent id reloaded the pre-compression transcript and dropped every turn generated after compression, including the assistant's response. On the desktop this is the recurring "I sent a message, came back, and the reply isn't there" report on large sessions: the chat's routed id is the pre-rotation id, and both the gateway session.resume RPC and the REST /messages read anchored on it. Fix the resolver at the chokepoint: resolve_resume_session_id() now follows the compression-continuation chain forward via get_compression_tip() before its existing empty-head descendant walk. get_compression_tip() only follows children whose parent ended with end_reason='compression' (created after the parent was ended), so delegation/branch children never hijack a resume. This fixes every resume caller at once (REST /messages, CLI --resume, gateway /resume). session.resume in tui_gateway was the one resume path that never called the resolver — it used the raw target id directly. Route it through resolve_resume_session_id() too (non-lazy only; lazy watch windows must stay on their exact child branch). Resolving up front also re-anchors the live-session fast path so a still-live rotated session is reused by its new key instead of rebuilding a duplicate agent on the stale parent. Tests: - resolve_resume_session_id follows the tip even when the parent retains messages, and is not confused by a delegation child. - session.resume binds the agent to the continuation tip and returns the post-compression reply.	2026-06-18 15:56:43 -05:00
teknium1	3042045540	fix(picker): keep max_models=0 distinct from unlimited; lock cap semantics Follow-up to the cap-removal salvage. The contributor guarded the new unlimited default with `[:max_models] if max_models else ...`, which conflates max_models=0 (used by slug-only callers that want an empty model list) with None (unlimited). Tighten to `is not None` at all five slicing sites in list_authenticated_providers / list_picker_providers, and add a regression test asserting the three-way contract: None=full, 0=empty, N=first N.	2026-06-18 13:47:31 -07:00
islam666	9705e7944a	fix(picker): remove max_models=50 cap in interactive model pickers The interactive model pickers (Desktop REST API, TUI model.options, CLI /model) were hard-capped at max_models=50, which truncated large provider catalogs like Kilo Gateway (336 models) to just 50 entries. This made most models undiscoverable via the picker search box. Changes: - Change build_models_payload() default from max_models=50 to None (unlimited) - Change list_authenticated_providers() default from max_models=8 to None - Change list_picker_providers() default from max_models=8 to None - Fix all [:max_models] slicing to handle None as 'no limit' - Remove max_models=50 from 5 interactive picker callers: * web_server.py: get_model_options (Desktop /api/model/options) * web_server.py: get_recommended_default_model * model_switch.py: prewarm_picker_cache_async * tui_gateway/server.py: model.options JSON-RPC * cli.py: HermesCLI model picker - Telegram/Discord inline keyboard picker (gateway/slash_commands.py) still passes max_models=50 explicitly — unchanged behavior. The total_models field was already in the response payload and is now meaningful since models.length == total_models for interactive pickers. Fixes #48279	2026-06-18 13:47:31 -07:00
alelpoan	4ed2f33994	fix(thread): allow scrolling long user messages in chat history (#48619)	2026-06-18 15:44:27 -05:00
teknium1	0879d5cc8f	fix(gateway): preserve original transcript when /compress rotation is skipped The manual /compress handler called rewrite_transcript() unconditionally on the session id returned by _compress_context(). When rotation does not occur (e.g. _session_db unavailable, or the DB split raised), session_id is unchanged and rewrite_transcript() DELETEs the original messages and replaces them with only the compressed summary — permanent data loss (#44794, #39704). Guard the rewrite on actual rotation: only overwrite when _compress_context produced a new session id. Otherwise leave the original transcript intact and log a warning.	2026-06-18 13:38:35 -07:00
kyssta-exe	81ff916e57	fix(agent): flush un-persisted messages before session rotation (#47202) compress_context() rotates the session (end_session -> create_session) mid-turn when auto-compress triggers, but never called _flush_messages_to_session_db() first. Messages generated during the current turn that hadn't been persisted to state.db were silently lost. The same bug existed in cli.py:new_session() (/new command). Both paths now flush un-persisted messages before ending the old session.	2026-06-18 13:38:35 -07:00
Siddharth Balyan	73cd8622f9	feat(billing): /billing terminal billing — interactive TUI + CLI client (#45449 ) * feat(billing): nous_billing http client + BillingState core (phase 2b) Phase 2b terminal-billing client foundation: - hermes_cli/nous_billing.py: typed client for the 4 /api/billing/* endpoints (state/charge/poll/auto-top-up). Raises typed errors (BillingScopeRequired, BillingRateLimited, BillingAuthError) mapped from the live-verified contract; fail-open is the caller's job. Idempotency-Key enforced client-side. - agent/billing_view.py: surface-agnostic BillingState core + Decimal money parsing (server emits decimal strings, not 2dp), fail-open builder, idempotency-key gen, custom-amount validation. - 51 unit tests (decimal parse/format, payload tiering, error->exception matrix, fail-open, amount validation). Plan: docs/plans/2026-06-13-001-phase-2b-terminal-billing-tui-plan.md * feat(billing): billing:manage scope + lazy step-up re-auth (phase 2b) - NOUS_BILLING_MANAGE_SCOPE constant. - nous_token_has_billing_scope(): split-based scope check (no false-positive substring match). - step_up_nous_billing_scope(): re-runs the device flow requesting billing:manage, reusing the held credential's portal/inference URLs + client_id (so a preview stays a preview), persists like _login_nous but WITHOUT the model picker. Returns True iff the minted token carries the scope (False when NAS silently downscopes a non-admin / unticked grant). Lazy step-up (plan D-A): normal login path unchanged; 403 insufficient_scope from a billing call triggers this. 7 unit tests. * feat(billing): billing JSON-RPC methods for the TUI (phase 2b) billing.state / charge / charge_status / auto_reload / step_up in tui_gateway/server.py. 
Return STRUCTURED success envelopes (result.ok + result.error=<code>) rather than JSON-RPC-level errors, so the Ink rpc() promise always resolves and the TUI branches on the typed billing error code (insufficient_scope, rate_limited, no_payment_method, …) to render the right affordance. Money serialized as decimal STRINGS + display strings. charge mints + echoes an idempotency_key for retry reuse. 16 unit tests. * feat(billing): /billing CLI handler + command registry (phase 2b) - CommandDef("billing", subcommands=buy|auto-reload|limit), added to _SLACK_VIA_HERMES_ONLY so it routes via /hermes on Slack (keeps the 50-cap parity test green, same as /credits). - cli.py::_show_billing + screen helpers: all 5 screens (overview, buy→confirm→poll, auto-reload, monthly-limit read-only). Reuses _prompt_text_input_modal / _prompt_text_input (D-C). Non-interactive (_app is None) renders text + portal deep-link, never prompts (R7). Decimal money end-to-end. 2s/5-min cancellable poll loop; 429/503 = retry not failure; settled = ledger truth. Lazy step-up on 403 insufficient_scope. no_payment_method treated as mainline funnel-to-portal. - 6 CLI tests; 156 command tests (incl. Slack/Telegram parity) green. * feat(billing): /billing Ink TUI screens + tests (phase 2b) - ui-tui/src/app/slash/commands/billing.ts: /billing TUI command covering all 5 screens — overview (text), buy <amt> → ConfirmReq → charge → non-blocking 2s/5-min poll loop → settled/failed/timeout branches, auto-reload <below> <to> → ConfirmReq → PATCH, limit (read-only). Reuses the existing ConfirmReq overlay (D-C) — no bespoke component. Typed-error envelope branching: insufficient_scope arms the lazy step-up confirm; no_payment_method/rate_limited/cap funnel to portal. Client-side amount validation mirrors the server (bounds + 2dp). - gatewayTypes.ts: Billing* response interfaces. - registry.ts: register billingCommands. 
- billingCommand.test.ts: 12 vitest cases (overview/gating/buy-confirm-poll- settled/no_payment_method/step-up/limit/auto-reload/validation). TUI build green; 12/12 vitest pass; slash tests pass once @hermes/ink is built. * docs(billing): scrub private cross-repo references NAS is a private repo — remove all references to it from the public PR: - drop the cross-repo planning doc (planning scaffolding, not a deliverable; the PR description documents the design) - replace 'NAS' / 'PR #412 preview' mentions in code + test comments with generic 'the server' / 'a preview deployment' * docs(billing): scrub final NAS reference in step-up docstring * docs(billing): drop dangling plan-doc refs The phase-2b plan doc was removed in the cross-repo scrub (`300afcc0b`) but two module docstrings still pointed at it. Drop the dead refs. * feat(billing): interactive /billing overlay + step-up UX, portal-URL & token fixes Adds the interactive /billing TUI overlay and hardens the terminal-billing client across CLI and TUI. - TUI: full /billing overlay state machine (overview to buy to confirm, auto-reload, read-only monthly limit) reusing the existing confirm overlay. - Step-up: surface the verification link in-transcript and open the browser via the TUI's own opener (the device flow runs in the headless gateway, so a printed URL was being dropped); run the step-up handler off the main loop and emit the link as an out-of-band event so the gateway stays responsive. - Step-up copy is scope-accurate ("Billing permission granted") and re-checks /state so it never claims "enabled" when the org kill-switch is still off. - Portal deep-links resolve to absolute URLs against the active portal base (the server emits them relative) - fixes a bare "/billing?topup=open" link. - Billing calls refresh an expired access token via the stored refresh token instead of reporting a false "not logged in". 
- Optimistic funnel: advise "set up a saved card on the portal" up front when no card is on file (advisory, not a hard gate). - Token resolution is cached briefly so the 2s charge poll loop stops re-locking + re-reading the auth store on every tick; 401 re-resolves fresh. - Remove the temporary demo-mode shims. Validation: 87 Python billing tests, 88 TS tests (billing command + gateway event handler), tsc clean, ink + ui-tui builds green. * docs(billing): add /billing TUI screenshots for PR * fix(cli): guard _last_invalidate on bare instances; update stale prompt-fallback test The UI-invalidate throttle read self._last_invalidate unconditionally, which raised AttributeError on HermesCLI instances built without __init__ (the thread-safety test's object.__new__ shell). Guard the read with getattr. The off-main-thread branch of _prompt_text_input was changed (#23185) to cancel cleanly to None instead of falling back to a bare input() that would hang on the slash-worker thread; the test still asserted the old direct-input fallback. Update it to assert the current intended behavior: returns None, calls neither run_in_terminal nor input(), and does not hang.	2026-06-19 01:53:32 +05:30
brooklyn!	81eaedd0f5	Merge pull request #48533 from NousResearch/hermes/hermes-4061c6a8 fix(prompt,desktop,tui): dedupe parallel-tool-call steer + surface self-improvement review summary	2026-06-18 13:27:07 -05:00
Brooklyn Nicholson	51ee5b2c94	fix(desktop,tui): surface self-improvement review summary + honor memory_notifications The "💾 Self-improvement review" summary (skill/memory updated) was invisible on two surfaces: - Desktop Electron app had no review.summary event handler — skill/memory writes happened silently. Now appends a persistent system message to the transcript (matching the Ink TUI's persistent-line semantics, not a transient toast that can be missed). - tui_gateway (backs both 'hermes --tui' and the desktop) never read display.memory_notifications, so it always behaved as 'on' and ignored a user who set 'off'/'verbose'. Added _load_memory_notifications() (mirrors the messaging gateway's bool->str normalization, defaults to 'on') and wired it to agent.memory_notifications, matching gateway/run.py and the CLI. Delivery chain now reaches all surfaces: background_review.py -> background_review_callback -> review.summary event -> desktop transcript / Ink TUI line / gateway message / CLI print.	2026-06-18 13:22:12 -05:00
Brooklyn Nicholson	07e785d60a	fix(prompt): dedupe parallel-tool-call steer; correct its rationale The universal PARALLEL_TOOL_CALL_GUIDANCE block already lives on main, but it shipped with two rough edges this change cleans up: - It duplicated the batching steer for Google models. The GOOGLE_MODEL_OPERATIONAL_GUIDANCE block still carried its own "Parallel tool calls" bullet, so Gemini/Gemma received the instruction twice in one prompt. Drop the redundant bullet — the universal block is now the single source. - Its comment claimed "nothing in the open-source system prompt encouraged batching," which was wrong: the steer existed for Google models only. Reword to say the gap was that every other model got nothing. - Tighten the test that asserts the steer (precedence-correct), and add an invariant guarding against re-introducing the Google duplicate.	2026-06-18 13:22:12 -05:00
Teknium	0fa7d6f660	fix(desktop): never persist or restore a named custom provider as bare "custom" (#48547 ) * Port from cline/cline#11514: encourage parallel tool calls Add a universal system-prompt guidance block telling the model to batch independent tool calls (reads, searches, web fetches, read-only commands) into a single assistant turn instead of one call per turn. The runtime already executes independent batches concurrently (read-only tools always; non-overlapping path-scoped file ops); the open-source system prompt had nothing steering the model to PRODUCE the batch. Fewer round-trips means less resent context, which compounds over a long conversation. - prompt_builder.py: new PARALLEL_TOOL_CALL_GUIDANCE block (short, static, cache-amortised) modeled on TASK_COMPLETION_GUIDANCE. - system_prompt.py: inject right after the task-completion block, gated by agent.valid_tool_names + the new toggle. - agent_init.py: read agent.parallel_tool_call_guidance (default True). - config.py: add the default under the agent section. - test_prompt_builder.py: behavior-contract tests (batching steer, dependent carve-out, length bound) — invariants, not wording snapshots. Adapted from Cline's TypeScript tool-surface guidance to hermes-agent's Python prompt-assembly architecture and config-over-env conventions. * fix(desktop): never persist or restore a named custom provider as bare "custom" Custom providers vanish from the Desktop/TUI model picker with "No LLM provider configured" — repeatedly fixed (#44062, #44109, #45578) and repeatedly regressed (#44022, #47714) because every fix only recovered the entry identity from a persisted base_url. When a session is persisted/restored with the resolved provider "custom" and NO base_url, bare "custom" leaked through verbatim; resolve_runtime_provider("custom") routes to the OpenRouter default URL with no api_key, so the next turn/resume dies. 
Bare "custom" is the resolved billing class shared by every named providers:/ custom_providers: entry — it is not a routable identity. Centralize the "never let bare custom escape" invariant in one helper, runtime_provider.canonical_custom_identity(), and apply it at all four leak sites in tui_gateway/server.py: - _ensure_session_db_row — the ORIGIN: first DB write seeds the bad row - _runtime_model_config — live persist - _stored_session_runtime_overrides — resume restore (heals old rows; drops unrecoverable bare custom so resume falls back to config default) - _make_agent — rebuild / per-turn The helper recovers custom:<name> from the endpoint URL when present, else from config.model.provider (the durable identity left when no base_url survived). Regression tests in test_custom_provider_session_persistence.py lock the no-base_url vector at every site so it cannot regress again.	2026-06-18 11:11:51 -07:00
Teknium	38c8a9c10f	feat(memory): batch operations for single-turn memory updates (#48507) The memory tool was strictly one-op-per-call. With the store running near its char limit by design, a new add that would overflow gets rejected with 'consolidate now, then retry' -- but the model could not consolidate and add in one call. It had to remove/replace across several turns, then retry the add, each turn re-sending the whole conversation context. Expensive thrash. Add an 'operations' array: a list of add/replace/remove ops applied atomically against the FINAL char budget. The model frees space and adds new entries in ONE call, even when an add alone would overflow. All-or-nothing: any bad op aborts the whole batch, nothing written. Root-cause note: the two agent-level memory interception sites (agent_runtime_helpers.py, tool_executor.py) silently dropped any param not in their explicit kwarg list, so 'operations' never reached the handler and batch calls failed with 'Unknown action None'. Both now pass it through and bridge each add/replace op to external memory providers. Also: success response is now terminal (done=true + 'do not repeat' note, no full-entries echo that invited re-edits); schema rewritten to lead with the batch mechanism and an explicit one-shot stop rule (2138 -> 1476 chars). Live-verified: near-full consolidate-and-add went 7 calls -> 1 call, stable across 3 reps. 103 memory/approval tests + 398 background-review/run_agent tests green; 6 new batch tests added.	2026-06-18 10:19:33 -07:00
kshitij	2fa16ec2d2	Merge pull request #48529 from kshitijk4poor/salvage-48372-eap fix(install): relax EAP=Stop around native git/uv calls + fail-fast on uv venv failure (#48352, salvage of #48372)	2026-06-18 22:17:53 +05:30
kshitijk4poor	fd12e59e6b	fix(install): fail fast when uv venv genuinely fails under relaxed EAP PR #48372 relaxes EAP=Stop around the uv venv call so PowerShell 5.1 doesn't mistake uv's 'Using CPython ...' stderr for a terminating NativeCommandError. But relaxing EAP also means a genuine uv venv failure (exit != 0) no longer aborts on its own — Install-Venv would continue and print 'Virtual environment ready', and in stage mode Invoke-Stage would report ok=true, even though no venv was created. Capture $LASTEXITCODE immediately after the relaxed call and throw on non-zero (Pop-Location first, matching the function's other exit paths), so the venv stage fails fast instead of falsely succeeding. This is the explicit guard originally proposed in #48463 (devorun), composed on top of #48372's reusable helper + regression test. Adds a regression test asserting the uv venv exit-code capture + throw.	2026-06-18 22:11:35 +05:30
Teknium	c37fdec2d9	feat(dashboard): surface full per-MCP catalog detail; fix pip-install doc (#48520) The dashboard MCP catalog only showed name/description/transport and a non-clickable source. Users couldn't see what an entry connects to or runs before installing — the exact detail the docs trust model tells them to vet. - /api/mcp/catalog now returns transport target (url, or command+args), auth_type, git install source/ref + bootstrap commands, default-enabled tool hint, and post-install guidance per entry. - McpPage renders the endpoint URL (http) or command+args (stdio), the git install source/ref, a collapsible bootstrap-commands list, setup notes, and the source as a clickable link when it's a URL. - Docs: drop the 'uv pip install -e .[mcp]' quick-start step (Hermes does not support pip installs; MCP ships with the standard install) and note the dashboard now surfaces this detail. - Strengthen the catalog endpoint test to assert the new inspection fields.	2026-06-18 09:40:56 -07:00
kshitij	4af16b5da2	Merge pull request #48206 from ehz0ah/fix/openviking-current-api-rebased fix(openviking): adapt memory provider for current api	2026-06-18 21:53:42 +05:30
teknium1	5ffbfed193	feat(mcp-catalog): add official Unreal Engine 5.8 MCP server Epic's experimental Unreal MCP plugin embeds an MCP server inside the Unreal Editor process, served over local HTTP (127.0.0.1:8000/mcp by default). HTTP transport, no auth, no install block — the user enables the plugin in-editor and Hermes connects to the URL. Also drops test_optional_mcps_manifests_ship_in_both_wheel_and_sdist: it asserted wheel/sdist packaging targets for pip/Homebrew/Nix installs, which Hermes does not support — installs run from the repo checkout, where the catalog is discovered by directory iteration with no packaging step.	2026-06-18 09:16:40 -07:00
xxxigm	58ad6942d9	fix(tui): don't make Enter swallow trailing-space-only slash completions (#48425 ) * fix(tui): don't make Enter swallow trailing-space-only slash completions Submitting a slash command in the TUI took three Enter presses: one to complete the name (/ex → /exit), a second that only appended the trailing space the gateway adds to keep the classic-CLI prompt_toolkit dropdown open (/exit → "/exit "), and a third to actually submit. The composer's submit handler accepted the highlighted completion whenever applying it changed the input at all, so the whitespace-only delta ate an extra keypress. Treat a completion whose only change is trailing whitespace on an already-complete token as "already complete" and fall through to submit. Partial-name and argument completions (a real token change) still accept on Enter as before. The replace/accept logic is extracted into pure helpers (applyCompletion, completionToApplyOnSubmit) in domain/slash.ts. * test(tui): cover Enter/completion trailing-space behavior and isolate poller queue - completionApply.test.ts asserts completionToApplyOnSubmit accepts real token completions (partial command name, argument) but returns null for a trailing-space-only delta on an already-complete command, so Enter submits instead of needing extra presses. - test_notification_poller_delivers_completion / _skips_consumed previously shared the process-global process_registry.completion_queue. Their events carry no session_key, so a leaked/concurrent poller could dequeue and dispatch them to a fixture agent without run_conversation, flaking CI ("AttributeError: '_FakeAgent' object has no attribute 'run_conversation'"). Isolate the queue per test (fresh queue.Queue via monkeypatch), matching the sibling poller tests that already do this.	2026-06-18 11:04:59 -05:00
Teknium	25c590ccd0	fix(skills): refuse SKILLS_DIR root in rmtree guard, not just outside-tree The salvaged guard allowed _rmtree_writable(SKILLS_DIR) itself. No call site ever passes the root — every site passes a skill subdir or its .bak sibling — so allowing the root only preserves the #48200 footgun (a dest that collapses to the root wipes every installed skill). Require a strict-child relationship and update the test that documented the nonexistent 'full reset' capability.	2026-06-18 08:53:35 -07:00
Kewe63	f1254c8eaf	fix(skills): rmtree scope guard + default pre_update_backup to true (#48200) Defense-in-depth fix for the silent wipe of ~/.hermes/ documented in #48200. A `hermes update --yes` run silently destroyed a user's .env, MEMORY.md, kanban.db, custom skills, and scripts. Two changes: 1. `_rmtree_writable` in tools/skills_sync.py now refuses to rmtree anything outside SKILLS_DIR (the HERMES_HOME/skills/ root). All five call sites pass paths under SKILLS_DIR, so the guard is a no-op for current code and a loud, recoverable failure for any future regression (bad path join, malicious bundled manifest, stale path in scope after an exception). 2. The default `updates.pre_update_backup` flips from false to true in hermes_cli/config.py. A few minutes of zip per update is negligible compared to silent total data loss. Still overridable; --no-backup still works for one-off opt-out. Five new tests in TestRmtreeWritableScopeGuard (root path, hermes home, sibling dir, skills root itself, subdir) plus a flipped `test_default_enabled_creates_backup` in test_backup.py. 178/178 tests pass in the two affected files. Public method signatures unchanged, no test-stub blast radius. Closes #48200	2026-06-18 08:53:35 -07:00
Teknium	41babc702e	chore(release): map iamlukethedev to AUTHOR_MAP	2026-06-18 08:53:31 -07:00
Luke The Dev	3c3ac19d9c	fix(#37878): Address review feedback — fix trailing whitespace and add ANTHROPIC_API_KEY test Review feedback from egilewski: 1. Remove trailing whitespace from test docstring and mock patches (lines 1430, 1469, 1476, 1482) 2. Expand test coverage: also verify ANTHROPIC_API_KEY is stripped (not just OPENAI_API_KEY) Changes: - Remove trailing whitespace from test file - Add ANTHROPIC_API_KEY to test environment - Add assertion verifying ANTHROPIC_API_KEY is stripped from cua-driver subprocess env - Syntax verified: python3 -m py_compile tests/tools/test_computer_use.py ✓	2026-06-18 08:53:31 -07:00
Luke The Dev	2e5c04aaf7	fix(#37878): scrub operator environment before launching cua-driver MCP - Use _sanitize_subprocess_env() to filter Hermes-managed credentials from the cua-driver subprocess environment (issue #37878) - Prevents credential exfiltration to the third-party cua-driver binary - Aligns with existing pattern used by browser-tool and other tools - Add regression test to verify environment sanitization The cua-driver is a lower-trust MCP subprocess per SECURITY.md §2.3. Its inherited environment is now scrubbed by default, removing provider API keys, gateway tokens, and platform credentials that should not leak to third-party binaries. Fixes #37878	2026-06-18 08:53:31 -07:00
kshitij	b39ec2fc37	Merge pull request #48341 from xxxigm/fix/install-ps1-powershell-host-resolution fix(install): resolve PowerShell host instead of bare `powershell` for uv install	2026-06-18 21:09:50 +05:30
Siddharth Balyan	646cd1b43e	fix(nix): refresh npmDepsHash after the Electron 40.10.2 pin (#47792) (#48457) PR #47792 pinned Electron to an exact 40.10.2 and regenerated the root package-lock.json (dropping @electron/get@5 + @electron-internal/extract-zip, restoring @electron/get@2 + extract-zip@2 + yauzl), but did not refresh the shared npmDepsHash in nix/lib.nix. The hash still described the previous 40.10.3 lockfile, so npmConfigHook fails on every Nix build with "npmDepsHash is out of date" for hermes-tui / hermes-web / hermes-desktop. Regenerate the single shared hash to match the current lockfile. Verified with fetchNpmDeps (authoritative, not prefetch-npm-deps): nix build .#tui.npmDeps -> builds clean nix build .#tui -> Validating consistency -> Installing dependencies -> Finished npmConfigHook (no hash error)	2026-06-18 15:00:08 +00:00
teknium1	ef4b897a18	chore(release): map srojk34 author email	2026-06-18 05:55:17 -07:00
srojk34	92e6d8c858	fix(desktop): dispose open PTY sessions in before-quit handler The `before-quit` handler tears down the bootstrap controller, preview watchers, and the Python backend but never disposes live PTY sessions. When `app.quit()` proceeds to `FreeEnvironment()`, node-pty's `ThreadSafeFunction::CallJS` callback fires on a half-torn-down environment, throws a C++ exception that can no longer be caught, and the process aborts (microsoft/node-pty#904). Iterate `terminalSessions` and call `disposeTerminalSession()` (which already calls `pty.kill()` + deletes the map entry) before killing the backend, so the ThreadSafeFunctions are removed before teardown begins. Closes #48335	2026-06-18 05:55:17 -07:00
Teknium	2f7c4858a7	fix(tui): refresh tool snapshot when MCP discovery lands after agent build (#48403) The TUI banner reported fewer tools than the classic CLI for the same config (e.g. 32 vs 38) when an MCP server connected slowly. Root cause: the agent snapshots `agent.tools` once at build time and never re-reads the registry. `_make_agent` briefly joins the background MCP discovery thread (`wait_for_mcp_discovery`, ~0.75s) so fast servers land in that snapshot, but a server slower than the bound (common for an HTTP MCP server on first connect) lands after the agent is built. Its tools are then absent from both the agent (uncallable until `/reload-mcp`) and the banner for the whole session. The classic CLI doesn't hit this because it re-derives `get_tool_definitions()` at banner render time (which re-waits for discovery), so it picks the late tools up. Fix: after a fresh agent is built and its first `session.info` emitted, if discovery is still in flight, schedule an off-critical-path daemon that waits for it to finish, then rebuilds the tool snapshot and re-emits `session.info`, the same rebuild `/reload-mcp` performs, but automatic. Both the agent's callable tools and the banner count catch up. Cache safety: the rebuild runs only while the session is still pre-first-turn (`_user_turn_count`/`_api_call_count` both 0 → nothing cached to invalidate). Once the user has sent a message we leave the snapshot frozen rather than break the cached prompt prefix mid-conversation; late tools then require an explicit `/reload-mcp` (user-consented), exactly as today. No-op when discovery finished before the agent build, when the join times out, when the registry was unchanged, or when the session was swapped/closed while waiting. Adds entry.mcp_discovery_in_flight() / join_mcp_discovery() accessors and covers the matrix (added/none/post-turn/timeout/unchanged/replaced) with unit tests.	2026-06-18 05:41:23 -07:00
Teknium	8abdab24c9	fix(tui): MCP headline counts connected servers, not disabled ones (#48402) The TUI banner footer used the raw `info.mcp_servers.length`, so a configured-but-disabled server (e.g. `linear`) was counted alongside connected ones. With a disabled `linear` and a connected `nous-support`, the TUI reported "2 MCP" while the classic CLI correctly reported "1 MCP" (`mcp_connected = sum(1 for s in mcp_status if s["connected"])` in hermes_cli/banner.py). The collapse toggle even labels the count "connected", which was wrong for the same reason. Count connected servers for both the toggle and the footer segment, and drop the `· N MCP` segment entirely when none are connected (matching the classic banner, which only appends it when the count is > 0). The expandable MCP section still lists every configured server, including disabled ones. Invariant test renders SessionPanel and asserts the headline equals the connected count, never the configured total.	2026-06-18 05:41:19 -07:00
Tranquil-Flow	67316fdc94	fix(install): relax native stderr handling in install.ps1 (#48352)	2026-06-18 12:06:29 +02:00
xxxigm	feff283e17	test(install): lock uv installer to a resolved PowerShell host Source-level guard (install.ps1 only runs on Windows, so there's no Linux CI runner to execute it): the astral uv install line must be invoked via the call operator on a resolved host variable, the bare-`powershell` literal that produced the field-reported "The term 'powershell' is not recognized" must be gone, and the resolver must be PATH-independent (Get-Process -Id $PID) and pwsh-aware.	2026-06-18 16:26:34 +07:00
xxxigm	a14bae6bcc	fix(install): resolve PowerShell host instead of bare `powershell` for uv The Windows installer's Install-Uv spawned the astral uv installer with a hardcoded bare `powershell -ExecutionPolicy ByPass -c "irm .../uv | iex"`. That name resolves only to Windows PowerShell, and only when its System32 directory is on PATH. Run under PowerShell 7+ (`pwsh`), or any session where `powershell` isn't on PATH, the spawn dies with "The term 'powershell' is not recognized", and uv installation aborts (the installer then appears stuck). Add Get-PowerShellHostExe, which prefers the absolute path of the host we're already running in (PATH-independent), then falls back to powershell/pwsh via Get-Command, then to the bare name. Install-Uv now invokes that resolved exe.	2026-06-18 16:26:34 +07:00
qin-ctx	2a5d51c16e	fix(openviking): adapt memory provider for current api (cherry picked from commit `cbb87389f3`)	2026-06-18 16:58:11 +08:00
kshitij	426f321e84	Merge pull request #48299 from NousResearch/chore/author-map-infinitycrew39 chore(release): map infinitycrew39 author email	2026-06-18 13:09:59 +05:30
kshitijk4poor	ca28c630c7	chore(release): map infinitycrew39 author email Add infinitycrew39@gmail.com -> infinitycrew39 to AUTHOR_MAP so the contributor audit resolves the two cherry-picked commits from the #47945 langfuse trace-scope salvage (merged as #48292) to a GitHub handle instead of flagging them as an unmapped author email.	2026-06-18 13:09:34 +05:30
kshitij	9b2f7d2cb1	Merge pull request #48292 from NousResearch/fix/langfuse-trace-scope-salvage fix(langfuse): scope trace state by turn/request ids (salvage #47945)	2026-06-18 13:08:17 +05:30
kshitijk4poor	0787ea07c8	test(langfuse): pin exact surviving key in turn-isolation test The prior assertion `all("turn1" in k or "turn2" in k for k in keys)` was weak on two counts: it passes vacuously when keys is empty (a regression that lost all state would slip through), and after turn 2 finalizes only turn 1 lingers, so it only ever inspected turn 1 anyway. Replace it with an exact check that one key survives, it is turn 1, and turn 2 never merged into it — the real isolation invariant the test name claims.	2026-06-18 13:00:01 +05:30
kshitijk4poor	f4fbaa6cda	fix(langfuse): bound _TRACE_STATE growth from non-finalizing turns Scoping the trace key by turn_id (the prior commit) fixed cross-turn collisions but introduced a slow leak: _finish_trace only pops a key when a turn ends cleanly (final response has content and no tool calls), so any turn that is interrupted, ends on a tool call, or has empty final content now leaves its uniquely-keyed entry in _TRACE_STATE forever. Previously the constant per-session key was overwritten by the next turn, capping growth at ~1 entry per session. Add an LRU cap (_MAX_TRACE_STATE) enforced by _evict_stale_locked, called under _STATE_LOCK immediately before each insert. It evicts the least-recently-updated entries (using the previously-dead last_updated_at field) and ends their root span so nothing dangles. Regression test drives 50 non-finalizing turns against a cap of 8 and asserts the dict stays bounded with the most-recent turns surviving.	2026-06-18 12:59:41 +05:30
kshitijk4poor	e1d10ec1ed	refactor(langfuse): extract _scope_prefix from _trace_key The turn- and api-scoped branches each repeated the same task/session/thread fallback ladder with only the infix differing. Extract the shared prefix into _scope_prefix so a future scope dimension touches one ladder instead of three. The legacy branch still returns a bare task_id (not the task: prefix) for backward compatibility, so it stays separate. Output key strings are unchanged; a new test pins them across every task/session/turn/api combination since the keys are matched across hooks and any drift would silently break trace finalization.	2026-06-18 12:58:24 +05:30
kshitij	860cf5133a	Merge pull request #48293 from kshitijk4poor/chore/skills-diff-cleanup refactor(skills): dedupe file-listing + share user-modified predicate (follow-up to #48286)	2026-06-18 12:49:53 +05:30
kshitijk4poor	f6fac60e66	refactor(skills): dedupe file-listing, share user-modified predicate, trim diff contract Cleanup pass on the salvage (behavior-preserving): - diff_bundled_skill now uses the existing _skill_file_list() helper instead of reimplementing the rglob/is_file/relative_to file-set enumeration inline (twice). - Extract _is_tracked_user_modification(origin_hash, user_hash) and use it in BOTH the sync loop and list_user_modified_bundled_skills() so the 'kept user edit' rule can't drift between the two sites. - _read_text_for_diff -> _read_for_diff returns (bytes, text); the binary branch now compares the bytes it already read instead of re-reading both files from disk. - Drop the unused 'user_present' key from diff_bundled_skill's return contract (no consumer or test ever read it). - test_update_modified_notice: drop the brittle '>= 2 sites' count-floor so consolidating the two print paths into a shared helper stays a welcome refactor; keep the per-site 'count notice => discovery hint' invariant (still mutation-tested).	2026-06-18 12:42:58 +05:30
kshitijk4poor	b4356135f2	test(langfuse): add end-to-end turn-isolation regression The PR added helper-level tests for _trace_key but nothing exercised the keys through the real hooks. This adds TestTurnTraceIsolation, which drives on_pre_llm_request / on_post_llm_call across two turns of one gateway session (task_id == session_id, unique turn_id, api_call_count reset per turn) and asserts each turn opens its own root trace when the first turn fails to finalize (tool-only final step). This test fails on the pre-fix code (only one trace opened, turn 2 absorbed into turn 1) and passes with the scoping fix. Also pins the turn_id-over-api_request_id key precedence: the turn-scoped post_llm_call carries no api_request_id, so it must still resolve to the same key as the request-scoped hooks or finalization breaks.	2026-06-18 12:38:44 +05:30
infinitycrew39	40ed67ccfe	test(langfuse): cover turn/api trace-key scoping	2026-06-18 12:36:35 +05:30
infinitycrew39	0b54a33a34	fix(langfuse): scope trace state by turn/request ids	2026-06-18 12:36:35 +05:30
kshitij	737007e335	Merge pull request #48286 from kshitijk4poor/salvage/skills-list-modified-diff feat(skills): find & diff user-modified bundled skills (salvage of #47802)	2026-06-18 12:33:28 +05:30

12013 commits