hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-21 10:22:18 +00:00

Author	SHA1	Message	Date
Teknium	38c8a9c10f	feat(memory): batch operations for single-turn memory updates (#48507) The memory tool was strictly one-op-per-call. With the store running near its char limit by design, a new add that would overflow gets rejected with 'consolidate now, then retry' -- but the model could not consolidate and add in one call. It had to remove/replace across several turns, then retry the add, each turn re-sending the whole conversation context. Expensive thrash. Add an 'operations' array: a list of add/replace/remove ops applied atomically against the FINAL char budget. The model frees space and adds new entries in ONE call, even when an add alone would overflow. All-or-nothing: any bad op aborts the whole batch, nothing written. Root-cause note: the two agent-level memory interception sites (agent_runtime_helpers.py, tool_executor.py) silently dropped any param not in their explicit kwarg list, so 'operations' never reached the handler and batch calls failed with 'Unknown action None'. Both now pass it through and bridge each add/replace op to external memory providers. Also: success response is now terminal (done=true + 'do not repeat' note, no full-entries echo that invited re-edits); schema rewritten to lead with the batch mechanism and an explicit one-shot stop rule (2138 -> 1476 chars). Live-verified: near-full consolidate-and-add went 7 calls -> 1 call, stable across 3 reps. 103 memory/approval tests + 398 background-review/run_agent tests green; 6 new batch tests added.	2026-06-18 10:19:33 -07:00
kshitij	2fa16ec2d2	Merge pull request #48529 from kshitijk4poor/salvage-48372-eap fix(install): relax EAP=Stop around native git/uv calls + fail-fast on uv venv failure (#48352, salvage of #48372)	2026-06-18 22:17:53 +05:30
kshitijk4poor	fd12e59e6b	fix(install): fail fast when uv venv genuinely fails under relaxed EAP PR #48372 relaxes EAP=Stop around the uv venv call so PowerShell 5.1 doesn't mistake uv's 'Using CPython ...' stderr for a terminating NativeCommandError. But relaxing EAP also means a genuine uv venv failure (exit != 0) no longer aborts on its own — Install-Venv would continue and print 'Virtual environment ready', and in stage mode Invoke-Stage would report ok=true, even though no venv was created. Capture $LASTEXITCODE immediately after the relaxed call and throw on non-zero (Pop-Location first, matching the function's other exit paths), so the venv stage fails fast instead of falsely succeeding. This is the explicit guard originally proposed in #48463 (devorun), composed on top of #48372's reusable helper + regression test. Adds a regression test asserting the uv venv exit-code capture + throw.	2026-06-18 22:11:35 +05:30
Teknium	c37fdec2d9	feat(dashboard): surface full per-MCP catalog detail; fix pip-install doc (#48520) The dashboard MCP catalog only showed name/description/transport and a non-clickable source. Users couldn't see what an entry connects to or runs before installing — the exact detail the docs trust model tells them to vet. - /api/mcp/catalog now returns transport target (url, or command+args), auth_type, git install source/ref + bootstrap commands, default-enabled tool hint, and post-install guidance per entry. - McpPage renders the endpoint URL (http) or command+args (stdio), the git install source/ref, a collapsible bootstrap-commands list, setup notes, and the source as a clickable link when it's a URL. - Docs: drop the 'uv pip install -e .[mcp]' quick-start step (Hermes does not support pip installs; MCP ships with the standard install) and note the dashboard now surfaces this detail. - Strengthen the catalog endpoint test to assert the new inspection fields.	2026-06-18 09:40:56 -07:00
kshitij	4af16b5da2	Merge pull request #48206 from ehz0ah/fix/openviking-current-api-rebased fix(openviking): adapt memory provider for current api	2026-06-18 21:53:42 +05:30
teknium1	5ffbfed193	feat(mcp-catalog): add official Unreal Engine 5.8 MCP server Epic's experimental Unreal MCP plugin embeds an MCP server inside the Unreal Editor process, served over local HTTP (127.0.0.1:8000/mcp by default). HTTP transport, no auth, no install block — the user enables the plugin in-editor and Hermes connects to the URL. Also drops test_optional_mcps_manifests_ship_in_both_wheel_and_sdist: it asserted wheel/sdist packaging targets for pip/Homebrew/Nix installs, which Hermes does not support — installs run from the repo checkout, where the catalog is discovered by directory iteration with no packaging step.	2026-06-18 09:16:40 -07:00
xxxigm	58ad6942d9	fix(tui): don't make Enter swallow trailing-space-only slash completions (#48425) * fix(tui): don't make Enter swallow trailing-space-only slash completions Submitting a slash command in the TUI took three Enter presses: one to complete the name (/ex → /exit), a second that only appended the trailing space the gateway adds to keep the classic-CLI prompt_toolkit dropdown open (/exit → "/exit "), and a third to actually submit. The composer's submit handler accepted the highlighted completion whenever applying it changed the input at all, so the whitespace-only delta ate an extra keypress. Treat a completion whose only change is trailing whitespace on an already-complete token as "already complete" and fall through to submit. Partial-name and argument completions (a real token change) still accept on Enter as before. The replace/accept logic is extracted into pure helpers (applyCompletion, completionToApplyOnSubmit) in domain/slash.ts. * test(tui): cover Enter/completion trailing-space behavior and isolate poller queue - completionApply.test.ts asserts completionToApplyOnSubmit accepts real token completions (partial command name, argument) but returns null for a trailing-space-only delta on an already-complete command, so Enter submits instead of needing extra presses. - test_notification_poller_delivers_completion / _skips_consumed previously shared the process-global process_registry.completion_queue. Their events carry no session_key, so a leaked/concurrent poller could dequeue and dispatch them to a fixture agent without run_conversation, flaking CI ("AttributeError: '_FakeAgent' object has no attribute 'run_conversation'"). Isolate the queue per test (fresh queue.Queue via monkeypatch), matching the sibling poller tests that already do this.	2026-06-18 11:04:59 -05:00
Teknium	25c590ccd0	fix(skills): refuse SKILLS_DIR root in rmtree guard, not just outside-tree The salvaged guard allowed _rmtree_writable(SKILLS_DIR) itself. No call site ever passes the root — every site passes a skill subdir or its .bak sibling — so allowing the root only preserves the #48200 footgun (a dest that collapses to the root wipes every installed skill). Require a strict-child relationship and update the test that documented the nonexistent 'full reset' capability.	2026-06-18 08:53:35 -07:00
Kewe63	f1254c8eaf	fix(skills): rmtree scope guard + default pre_update_backup to true (#48200) Defense-in-depth fix for the silent wipe of ~/.hermes/ documented in #48200. A `hermes update --yes` run silently destroyed a user's .env, MEMORY.md, kanban.db, custom skills, and scripts. Two changes: 1. `_rmtree_writable` in tools/skills_sync.py now refuses to rmtree anything outside SKILLS_DIR (the HERMES_HOME/skills/ root). All five call sites pass paths under SKILLS_DIR, so the guard is a no-op for current code and a loud, recoverable failure for any future regression (bad path join, malicious bundled manifest, stale path in scope after an exception). 2. The default `updates.pre_update_backup` flips from false to true in hermes_cli/config.py. A few minutes of zip per update is negligible compared to silent total data loss. Still overridable; --no-backup still works for one-off opt-out. Five new tests in TestRmtreeWritableScopeGuard (root path, hermes home, sibling dir, skills root itself, subdir) plus a flipped `test_default_enabled_creates_backup` in test_backup.py. 178/178 tests pass in the two affected files. Public method signatures unchanged, no test-stub blast radius. Closes #48200	2026-06-18 08:53:35 -07:00
Luke The Dev	3c3ac19d9c	fix(#37878): Address review feedback — fix trailing whitespace and add ANTHROPIC_API_KEY test Review feedback from egilewski: 1. Remove trailing whitespace from test docstring and mock patches (lines 1430, 1469, 1476, 1482) 2. Expand test coverage: also verify ANTHROPIC_API_KEY is stripped (not just OPENAI_API_KEY) Changes: - Remove trailing whitespace from test file - Add ANTHROPIC_API_KEY to test environment - Add assertion verifying ANTHROPIC_API_KEY is stripped from cua-driver subprocess env - Syntax verified: python3 -m py_compile tests/tools/test_computer_use.py ✓	2026-06-18 08:53:31 -07:00
Luke The Dev	2e5c04aaf7	fix(#37878): scrub operator environment before launching cua-driver MCP - Use _sanitize_subprocess_env() to filter Hermes-managed credentials from the cua-driver subprocess environment (issue #37878) - Prevents credential exfiltration to the third-party cua-driver binary - Aligns with existing pattern used by browser-tool and other tools - Add regression test to verify environment sanitization The cua-driver is a lower-trust MCP subprocess per SECURITY.md §2.3. Its inherited environment is now scrubbed by default, removing provider API keys, gateway tokens, and platform credentials that should not leak to third-party binaries. Fixes #37878	2026-06-18 08:53:31 -07:00
kshitij	b39ec2fc37	Merge pull request #48341 from xxxigm/fix/install-ps1-powershell-host-resolution fix(install): resolve PowerShell host instead of bare `powershell` for uv install	2026-06-18 21:09:50 +05:30
Teknium	2f7c4858a7	fix(tui): refresh tool snapshot when MCP discovery lands after agent build (#48403) The TUI banner reported fewer tools than the classic CLI for the same config (e.g. 32 vs 38) when an MCP server connected slowly. Root cause: the agent snapshots `agent.tools` once at build time and never re-reads the registry. `_make_agent` briefly joins the background MCP discovery thread (`wait_for_mcp_discovery`, ~0.75s) so fast servers land in that snapshot, but a server slower than the bound — common for an HTTP MCP server on first connect — lands after the agent is built. Its tools are then absent from both the agent (uncallable until `/reload-mcp`) and the banner for the whole session. The classic CLI doesn't hit this because it re-derives `get_tool_definitions()` at banner render time (which re-waits for discovery), so it picks the late tools up. Fix: after a fresh agent is built and its first `session.info` emitted, if discovery is still in flight, schedule an off-critical-path daemon that waits for it to finish, then rebuilds the tool snapshot and re-emits `session.info` — the same rebuild `/reload-mcp` performs, but automatic. Both the agent's callable tools and the banner count catch up. Cache safety: the rebuild runs only while the session is still pre-first-turn (`_user_turn_count`/`_api_call_count` both 0 → nothing cached to invalidate). Once the user has sent a message we leave the snapshot frozen rather than break the cached prompt prefix mid-conversation; late tools then require an explicit `/reload-mcp` (user-consented), exactly as today. No-op when discovery finished before the agent build, when the join times out, when the registry was unchanged, or when the session was swapped/closed while waiting. Adds entry.mcp_discovery_in_flight() / join_mcp_discovery() accessors and covers the matrix (added/none/post-turn/timeout/unchanged/replaced) with unit tests.	2026-06-18 05:41:23 -07:00
Tranquil-Flow	67316fdc94	fix(install): relax native stderr handling in install.ps1 (#48352)	2026-06-18 12:06:29 +02:00
xxxigm	feff283e17	test(install): lock uv installer to a resolved PowerShell host Source-level guard (install.ps1 only runs on Windows, so there's no Linux CI runner to execute it): the astral uv install line must be invoked via the call operator on a resolved host variable, the bare-`powershell` literal that produced the field-reported "The term 'powershell' is not recognized" must be gone, and the resolver must be PATH-independent (Get-Process -Id $PID) and pwsh-aware.	2026-06-18 16:26:34 +07:00
qin-ctx	2a5d51c16e	fix(openviking): adapt memory provider for current api (cherry picked from commit `cbb87389f3`)	2026-06-18 16:58:11 +08:00
kshitij	9b2f7d2cb1	Merge pull request #48292 from NousResearch/fix/langfuse-trace-scope-salvage fix(langfuse): scope trace state by turn/request ids (salvage #47945)	2026-06-18 13:08:17 +05:30
kshitijk4poor	0787ea07c8	test(langfuse): pin exact surviving key in turn-isolation test The prior assertion `all("turn1" in k or "turn2" in k for k in keys)` was weak on two counts: it passes vacuously when keys is empty (a regression that lost all state would slip through), and after turn 2 finalizes only turn 1 lingers, so it only ever inspected turn 1 anyway. Replace it with an exact check that one key survives, it is turn 1, and turn 2 never merged into it — the real isolation invariant the test name claims.	2026-06-18 13:00:01 +05:30
kshitijk4poor	f4fbaa6cda	fix(langfuse): bound _TRACE_STATE growth from non-finalizing turns Scoping the trace key by turn_id (the prior commit) fixed cross-turn collisions but introduced a slow leak: _finish_trace only pops a key when a turn ends cleanly (final response has content and no tool calls), so any turn that is interrupted, ends on a tool call, or has empty final content now leaves its uniquely-keyed entry in _TRACE_STATE forever. Previously the constant per-session key was overwritten by the next turn, capping growth at ~1 entry per session. Add an LRU cap (_MAX_TRACE_STATE) enforced by _evict_stale_locked, called under _STATE_LOCK immediately before each insert. It evicts the least-recently-updated entries (using the previously-dead last_updated_at field) and ends their root span so nothing dangles. Regression test drives 50 non-finalizing turns against a cap of 8 and asserts the dict stays bounded with the most-recent turns surviving.	2026-06-18 12:59:41 +05:30
kshitijk4poor	e1d10ec1ed	refactor(langfuse): extract _scope_prefix from _trace_key The turn- and api-scoped branches each repeated the same task/session/thread fallback ladder with only the infix differing. Extract the shared prefix into _scope_prefix so a future scope dimension touches one ladder instead of three. The legacy branch still returns a bare task_id (not the task: prefix) for backward compatibility, so it stays separate. Output key strings are unchanged; a new test pins them across every task/session/turn/api combination since the keys are matched across hooks and any drift would silently break trace finalization.	2026-06-18 12:58:24 +05:30
kshitijk4poor	f6fac60e66	refactor(skills): dedupe file-listing, share user-modified predicate, trim diff contract Cleanup pass on the salvage (behavior-preserving): - diff_bundled_skill now uses the existing _skill_file_list() helper instead of reimplementing the rglob/is_file/relative_to file-set enumeration inline (twice). - Extract _is_tracked_user_modification(origin_hash, user_hash) and use it in BOTH the sync loop and list_user_modified_bundled_skills() so the 'kept user edit' rule can't drift between the two sites. - _read_text_for_diff -> _read_for_diff returns (bytes, text); the binary branch now compares the bytes it already read instead of re-reading both files from disk. - Drop the unused 'user_present' key from diff_bundled_skill's return contract (no consumer or test ever read it). - test_update_modified_notice: drop the brittle '>= 2 sites' count-floor so consolidating the two print paths into a shared helper stays a welcome refactor; keep the per-site 'count notice => discovery hint' invariant (still mutation-tested).	2026-06-18 12:42:58 +05:30
kshitijk4poor	b4356135f2	test(langfuse): add end-to-end turn-isolation regression The PR added helper-level tests for _trace_key but nothing exercised the keys through the real hooks. This adds TestTurnTraceIsolation, which drives on_pre_llm_request / on_post_llm_call across two turns of one gateway session (task_id == session_id, unique turn_id, api_call_count reset per turn) and asserts each turn opens its own root trace when the first turn fails to finalize (tool-only final step). This test fails on the pre-fix code (only one trace opened, turn 2 absorbed into turn 1) and passes with the scoping fix. Also pins the turn_id-over-api_request_id key precedence: the turn-scoped post_llm_call carries no api_request_id, so it must still resolve to the same key as the request-scoped hooks or finalization breaks.	2026-06-18 12:38:44 +05:30
infinitycrew39	40ed67ccfe	test(langfuse): cover turn/api trace-key scoping	2026-06-18 12:36:35 +05:30
kshitijk4poor	6777916068	fix(skills): surface list-modified hint on both update paths + disambiguate diff Salvage follow-up to the cherry-picked feat/test commits: - W1: the unpack/install update path in main.py printed the '~ N user-modified (kept)' notice without the new 'hermes skills list-modified' hint that the git-pull path got. Mirror the hint to both sites so the count is actionable regardless of which update path runs. - W2: 'hermes skills diff <name>' (bundled-vs-stock) now shares the verb with the gateway write-approval 'diff <id>'. The gateway handler's docstring + truncation message pointed users to '/skills diff <id>' on the CLI, which now resolves a bundled skill by that name instead. Point at the pending JSON file and note the two diff commands are distinct. - Add an invariant test asserting every 'user-modified (kept)' notice in main.py carries the discovery hint (guards sibling drift).	2026-06-18 12:28:11 +05:30
xxxigm	481f0417d8	test(skills): cover list-modified + diff for bundled skills Exercises the real sync pipeline (no mocked comparison logic): a pristine synced skill is not flagged; an edited one is listed and diffed (modified + added files); an unknown skill returns not-ok; and `reset --restore` clears the modified state so revert and discovery stay consistent.	2026-06-18 12:26:20 +05:30
kshitij	832d5967f8	Merge pull request #48262 from kshitijk4poor/salvage-32445 feat(memory): improve OpenViking setup UX (salvage #32445)	2026-06-18 11:34:11 +05:30
kshitijk4poor	1153b42b24	Merge upstream/main into OpenViking setup-UX (salvage #32445) Resolves conflicts from the OpenViking churn that merged after #32445 was opened (#48042/#47662 session-switch + write hardening, #47311/#47973): - plugins/memory/openviking/__init__.py: keep both __init__ field groups (the PR's _runtime_start_* alongside main's _prefetch_threads/_shutting_down). - tests/plugins/memory/test_openviking_provider.py: keep BOTH the PR's new setup-validation tests and main's session-switch/concurrency tests (disjoint additions to the same region). Two fixes layered while reconciling (contributor work otherwise preserved): - Restore the merged tenant-header contract (#22414/#21232). The PR had changed _VikingClient defaults to '' and made empty account/user OMIT the tenant headers; main's contract is that empty falls back to 'default' and the X-OpenViking-Account/User headers are ALWAYS sent (ROOT API keys need them). Reverted the constructor to 'account or os.environ.get(..., "default")' and updated the two PR tests that asserted the omit-when-empty behavior. - Close a secret-file TOCTOU in the setup writers. _write_env_vars and _write_ovcli_config wrote the api_key/root_api_key file and chmod 0600 AFTERWARD, leaving a world-readable window on newly-created files. Added _precreate_secret_file() to create with 0600 before any secret bytes land.	2026-06-18 11:28:51 +05:30
Ben Barclay	c661634537	fix(dashboard): stream file uploads via multipart instead of base64 JSON (NS-501) (#47663) * fix(dashboard): stream file uploads via multipart instead of base64 JSON The dashboard file manager uploaded files (including backup/restore zip archives) by reading them client-side with FileReader.readAsDataURL and POSTing a base64 data URL inside a JSON body to /api/files/upload. For a large backup this (a) inflates the payload ~33%, (b) buffers the whole file plus its decoded copy in memory, and (c) reliably trips an upstream proxy body-size/timeout limit, surfacing as a 502 with the upload appearing to hang indefinitely (NS-501). Dashboard-only hosted users have no shell fallback to place the archive, so backup restore was unusable. Add a streaming multipart endpoint POST /api/files/upload-stream (UploadFile + Form) that reads the request body in 1 MiB chunks straight to a sibling temp file, enforces the existing 100 MB size cap as it streams (413 on overflow, before buffering the whole file), and atomically renames into place so a partial/aborted/over-limit upload never clobbers an existing file. The frontend api.uploadFile now sends multipart/form-data (raw bytes, no base64, browser-set boundary) and FilesPage passes the File object directly; the dead readAsDataUrl helper is removed. The legacy base64 JSON endpoint stays for backward compat. FastAPI's UploadFile/Form require python-multipart, which is NOT pulled in by fastapi itself, so it is added to the base deps, the [web] extra, and the tool.dashboard lazy-install set (kept in sync).
Validated: 5 new endpoint tests (roundtrip, multi-chunk >1 MiB, over-limit 413 without clobbering + no temp-file leak, overwrite=false conflict, forced-root traversal containment); existing base64 tests still pass; web typecheck + vite build clean; and a real uvicorn server E2E (5 MB multipart upload -> HTTP 200 in 0.21s, exact byte match) plus a 30 MB TestClient roundtrip confirm constant-memory streaming end to end. Reported via beta (NS-501). * build(deps): regenerate uv.lock for python-multipart (NS-501) CI ran uv lock --check / uv sync --locked which failed because the python-multipart dependency add was not reflected in uv.lock. Regenerate the lockfile (resolves to 0.0.20, matching the [web] extra pin) after merging current main.	2026-06-18 15:54:32 +10:00
Ben Barclay	9c3c5da356	fix(backup): hermes import never overwrites volatile gateway runtime state (NS-501) (#48243) Importing a backup wrote every file from the zip over the target home wholesale. On a hosted instance this clobbered gateway_state.json with the source machine's last recorded run/desired state — driving the container-boot reconciler (container_boot._read_desired_state, which only auto-starts a gateway whose state is "running") off stale/foreign state and leaving the gateway stuck "starting", disconnected from the Nous portal. Add _IMPORT_SKIP_NAMES (gateway_state.json, gateway.pid, cron.pid, gateway.lock, processes.json) and skip them by basename in run_import, so both the root profile and named profiles preserve the target's own runtime state. This mirrors what container_boot._STALE_RUNTIME_FILES already sweeps on every container boot, and protects against older backups that predate the backup-side exclusions. The import summary reports which files were preserved. This is the second half of NS-501 (filed separately as NS-508): the upload 502 was fixed in #47663; this fixes the import-breaks-the-instance half.	2026-06-18 15:27:45 +10:00
Ben Barclay	0ddd21c74e	feat(relay): managed-boot self-provision client (Phase 3, gateway side) (#48242) The gateway half of relay Phase 3. On a MANAGED boot with relay configured and no secret pinned, the runtime self-provisions its relay credentials IN-PROCESS: resolve the agent's own Nous access token (resolve_nous_access_token) -> POST the connector's /relay/provision asserting its own endpoint + route keys -> set GATEWAY_RELAY_ID/SECRET/DELIVERY_KEY into os.environ so the immediately-following register_relay_adapter() reads them and dials out authenticated. No human, no enrollment token, no disk write — the creds live only in process memory (save_env_value refuses under managed anyway, and keeping the secret off any volume is the stronger posture). Stateless: process-env creds don't survive a restart, so a managed container re-provisions every boot; the connector's rotation window covers a still-connected prior instance. An explicitly-pinned GATEWAY_RELAY_SECRET is respected (skip). Self-hosted is unchanged: humans keep using `hermes gateway enroll`. Endpoint provenance is gateway-asserted (GATEWAY_RELAY_ENDPOINT + GATEWAY_RELAY_ROUTE_KEYS, env or gateway.relay_* config) — uniform code path whether the operator sets it (self-hosted) or NAS stamps it (hosted, the only case NAS knows the public URL). Both absent -> outbound-only provisioning (credentials, no inbound routes). The connector scopes the asserted endpoint to the verified tenant, so it stays within the security model. - gateway/relay/__init__.py: relay_endpoint(), relay_route_keys(), _provision_url(), _post_provision(), self_provision_if_managed() (never raises — a provision failure logs and boots without relay auth). - gateway/run.py: call self_provision_if_managed() immediately before register_relay_adapter() in the startup path.
Tests: 12 unit (trigger logic, respect-pinned-secret, in-process env wiring, endpoint+routes vs outbound-only, fail-soft on token/connector failure); mutation-checked (drop is_managed guard / pinned-secret guard -> tests fail). Cross-repo live E2E driver lands on the connector side (depends on this). EXPERIMENTAL: relay auth scheme may change until >=2 Class-1 platforms validate.	2026-06-18 15:25:29 +10:00
Ben Barclay	4440d77bf3	fix(update): scope install-method stamp to the code tree, not $HERMES_HOME (#48188) The install method (docker/git/pip/...) describes the running binary, but detect_install_method() read it from $HERMES_HOME/.install_method — a shared DATA directory. The Docker docs deliberately bind-mount $HERMES_HOME (~/.hermes:/opt/data) so config/sessions/memory persist and can be shared with a host-side Desktop/CLI install. When a containerized gateway and a host install share one $HERMES_HOME, the home-scoped stamp is a single slot describing two installs: the published image stamps 'docker' on every boot, the host install then reads 'docker' and the in-app updater refuses to run 'hermes update' ("doesn't apply inside the Docker container"). Reinstalling the Desktop app from the DMG doesn't help because the contaminated stamp is re-read every time. Fix (option 1 — code-scoped stamp): - detect_install_method() reads <install tree>/.install_method first (next to the running code, immune to the shared data dir). It falls back to the legacy $HERMES_HOME stamp for back-compat, but IGNORES a 'docker' home stamp when not actually containerized — so already-poisoned shared homes self-heal. - stamp_install_method() writes the code-scoped stamp. - install.sh stamps $INSTALL_DIR instead of $HERMES_HOME. - Dockerfile bakes 'docker' into /opt/hermes/.install_method at build time (inside the immutable block); stage2-hook.sh no longer writes the home stamp and proactively removes a stale 'docker' one to heal existing shared homes. Genuine containers still resolve to 'docker' (baked stamp, or legacy home stamp honored when containerized). Unstamped installs in generic containers still fall through to git/pip (preserves the #34397 fix).	2026-06-18 14:14:41 +10:00
Gille	3769dff5dd	fix(approval): honor glob command allowlist entries (#43051) * fix(approval): honor glob command allowlist entries * fix(approval): guard allowlist globs from shell chaining	2026-06-18 12:48:36 +10:00
Ben Barclay	c276b017ad	feat(relay): connector⇄gateway channel auth + signed-HTTP inbound receiver + enroll CLI (#48147) * feat(relay): authenticate the connector⇄gateway WS channel The relay gateway may be customer-managed and internet-exposed, so the connector⇄gateway channel is itself authenticated (distinct from the platform crypto the relay path sheds). Add gateway/relay/auth.py — a Python port of the connector's HMAC token + delivery-signature schemes (relayAuthToken.ts / deliverySigning.ts), verified byte-for-byte against the connector's compiled TypeScript via cross-language test vectors. Present an Authorization bearer on the /relay WS upgrade keyed by the per-gateway secret (resolved from GATEWAY_RELAY_ID / GATEWAY_RELAY_SECRET in env or config). The connector rejects an unauthenticated/invalid/revoked upgrade with close 4401. * feat(relay): signed-HTTP inbound delivery receiver The connector delivers normalized inbound events to a tenant's gateway over a signed HTTP POST, not the outbound /relay WS: the connector instance owning a platform socket is generally not the instance a given gateway dialed out to, so inbound targets a tenant endpoint that may load-balance across gateway instances. Add gateway/relay/inbound_receiver.py — verifies x-relay-signature / x-relay-timestamp over the EXACT raw request bytes (re-serializing would break the HMAC: JS JSON.stringify is compact, Python json.dumps spaces) against the per-tenant delivery key verify list within a 300s replay window, then dispatches messages to handle_message and interrupts to the interrupt handler. Wire it into the adapter lifecycle (start in connect() when a delivery key + bind port are configured, tear down in disconnect(); a purely-outbound dev gateway runs without it).
Refine test_relay_sheds_crypto to distinguish PLATFORM crypto (Discord ed25519, Twilio/WeCom HMAC — still shed) from the connector⇄gateway CHANNEL auth (intended): auth.py / inbound_receiver.py are exempt from the platform-symbol scan but still banned from importing platform-crypto modules, plus a positive guard that auth.py uses only stdlib hmac/hashlib. * feat(relay): hermes gateway enroll CLI Add the gateway half of zero-touch enrollment. `hermes gateway enroll` resolves a fresh Nous Portal access token (the tenant-proving identity), POSTs {enrollmentToken, gatewayId} to the connector's /relay/enroll, and persists GATEWAY_RELAY_ID / GATEWAY_RELAY_SECRET / GATEWAY_RELAY_DELIVERY_KEY to ~/.hermes/.env. The per-gateway secret authenticates the WS upgrade; the per-tenant delivery key verifies signed inbound deliveries. Refuses under is_managed() (hosted installs get the secret stamped in by the orchestrator). Added as an 'enroll' subcommand on the existing gateway subparser — not a new top-level command. * docs(relay): inbound is signed HTTP, not WS; document channel auth Fix the stale contract: §3/§5 said inbound rode the WS socket (single-instance only, predates the multi-instance socket-ownership + channel-auth model). Inbound + connector→gateway interrupt are signed HTTP POSTs to the tenant endpoint. Add §6.1 documenting the two channel-auth schemes (per-gateway WS-upgrade secret, per-tenant inbound delivery key) and how they differ from the platform crypto the relay path sheds. * test(relay): update build_gateway_parser callers for cmd_gateway_enroll The enroll subcommand added cmd_gateway_enroll as a required keyword-only arg to build_gateway_parser, but two existing parser-extraction tests still called it with only cmd_gateway/cmd_proxy — failing CI with TypeError. Thread the new handler through both call sites and add a test asserting `gateway enroll` dispatches to cmd_gateway_enroll with its flags parsed.	2026-06-18 12:01:54 +10:00
Ben Barclay	fcf6cb3d73	fix(docker): supervised gateway uses --replace to take over stale holder (NS-505) (#47555) * fix(docker): supervised gateway uses --replace to take over stale holder Inside the s6 container image the per-profile gateway service rendered a bare `hermes gateway run` (no --replace). When a gateway is started OUTSIDE s6 — a stray shell `hermes gateway run`, an agent action, or the Open WebUI helper (scripts/setup_open_webui.sh) — it grabs the per-HERMES_HOME PID lock first. The supervised slot then execs the bare `gateway run`, hits the "Another gateway instance is already running" guard, exits non-zero, and s6 restarts it: a restart loop that floods the log every ~12s and never binds. The container looks up but the gateway is permanently down, and dashboard-only users (no shell) cannot recover. Render the supervised run script as `gateway run --replace` so s6 is authoritative for its slot: it reaps the stale holder via the hardened takeover path (takeover marker + SIGTERM->SIGKILL-with-confirmation + scoped-lock cleanup in gateway/run.py) and binds. This matches the systemd service path, which already builds its argv with --replace (_build_gateway_argv / 'nohup hermes gateway run --replace'), and the intent already documented in _maybe_redirect_run_to_s6_supervision. The existing HERMES_S6_SUPERVISED_CHILD sentinel still prevents the run->start->run redirect recursion. Each profile is scoped to its own HERMES_HOME and s6 guarantees one supervised instance per slot, so there is no legitimate supervised sibling for --replace to clobber. Reported via beta (NS-505): gateway.log showed PID 17907 'running (manual process)' with the guard error repeating every ~12s on v2026.6.5. Adds a regression test asserting every gateway-run exec line in the rendered script (default + named profile, both privilege branches) carries --replace, and updates the existing render-script assertion.
* fix(ci): remove stray .venv symlink committed into repo The PR's commit accidentally tracked a .venv symlink pointing at the developer's local venv (mode 120000 -> /home/ben/nous/hermes-agent/.venv). The CI test/e2e/build jobs run `uv venv` to create .venv and failed with `failed to create directory .venv: File exists (os error 17)` because the checkout already contained the symlink. All test shards aborted in <15s during setup, before any test ran. Untrack the symlink and add a bare `.venv` entry to .gitignore (the existing `.venv/` rule only matches a directory, so a symlink slipped through).	2026-06-18 10:49:02 +10:00
teknium1	c5eb64b9f7	fix(xai): scope native web_search to swap-only + reconcile composer ctx to 200k Salvage corrections on top of @XVVH's #44341: - Make native web_search injection a 1:1 swap for an already-present client web_search function, NOT an additive grant. The original unconditionally appended {"type":"web_search"} on every is_xai_responses turn with any tools, force-enabling Grok server-side search even when the user never enabled the web toolset (bypassing Hermes web-provider config + tool-trace plumbing). Now gated on a client web_search actually being present. - Reconcile grok-composer context to 200000 (merged in #47908) rather than 262144; 200k is xAI's published usable context window for Composer 2.5, 262144 is the /v1/responses input+output budget. - Update tests to match scoped behavior + add a no-web-toolset guard test. - AUTHOR_MAP entry for #44341 salvage. Incomplete-guard (server-side *_call items at in_progress no longer flip has_incomplete_items) and preflight built-in-tool allowlist kept as-is.	2026-06-17 17:33:32 -07:00
XVVH	6f89e17a33	fix(xai): OAuth Responses native web_search, incomplete guard, grok-composer context - model_metadata: grok-composer-2.5-fast → 262144 (OAuth slug not in /v1/models) - codex transport: inject native {"type":"web_search"} for is_xai_responses; drop client web_search to avoid duplicate-name 400s - codex adapter: do not treat in-progress server-side *_call items as incomplete - tests: adapter, transport build_kwargs, model_metadata, oauth recovery	2026-06-17 17:33:32 -07:00
Teknium	020e59d3cf	fix(agent): dampen empty-name phantom tool-call loop (#47967) (#48109) Weak open models (mimo, nemotron-class) that see tool-call XML/JSON sitting in file contents or tool output get primed and emit their own structured tool calls mimicking the payload — usually with an empty/whitespace name. Those calls can't be fuzzy-repaired toward a real tool, so the dispatch loop returns an error and the model retries. Before this fix, every empty-name error dumped the full tool catalog back to the model, which fed the priming loop more names to mimic and inflated context 3-4x across the retry budget. A blank/whitespace-only tool name now gets a terse anti-priming error that tells the model in-context tool-call syntax is DATA, with no catalog dump. A genuinely-wrong-but-nonempty name (a real typo) still gets the full catalog so the model can self-correct. Not a sandbox/auth boundary issue: Hermes never parses tool-call text from content into executable calls (structured tool_calls only; the lone text->call parser is the Copilot ACP transport and it also rejects empty names). The reporter's own debug dump confirms the injection never executed. Behavior-contract test added: empty-name -> terse error, no catalog; nonempty unknown -> catalog preserved. Exercised end-to-end via run_conversation against an in-process mock provider.	2026-06-17 17:32:14 -07:00
Teknium	9ba4615db2	fix(dump): show commit date instead of release date in hermes debug (#48104) * feat(mcp): raise default tool-call timeout 120s -> 300s Port from openai/codex#28234. Long-running MCP tools (web fetches, sandboxed builds, deep-research servers) routinely exceed 120s, causing spurious timeout failures. Codex bumped its default MCP tool timeout from 120 to 300 for the same reason. - _DEFAULT_TOOL_TIMEOUT 120 -> 300 in tools/mcp_tool.py (per-server 'timeout' config override unchanged) - update test_default_timeout assertion - document the default in mcp-config-reference.md * fix(dump): show commit date instead of release date in hermes dump The version line in `hermes dump` (the top of the /debug report) appended the package release date in parentheses, which reads like a wall-clock "generated at" timestamp and confuses support triage. Replace it with the date the HEAD commit was actually made, resolved live via `git log -1 --format=%cd --date=short`, kept next to the commit SHA. On Docker/wheel installs with no .git the date resolves to '' and the suffix is simply omitted (the baked SHA still identifies the build).	2026-06-17 16:53:42 -07:00
brooklyn!	c1f9eb0ec4	fix(desktop): resolve electronDist dynamically + self-heal blocked installs (supersedes #48081/#48082) (#48091) * fix(desktop): resolve electronDist dynamically + self-heal blocked installs Supersedes the static-path approach (#48081) and the install-step self-heal (#48082) with a fix that removes the whole failure class instead of chasing each symptom. Three distinct faults converged into the June desktop-build outage; this closes all three. Root cause (the part #48081 left open — "Gap B"): build.electronDist was a static relative path in apps/desktop/package.json, but npm workspace hoisting is NOT deterministic — depending on the npm version and what else is installed, npm nests the workspace-only electron devDep under apps/desktop/node_modules/electron OR hoists it to the repo root. A static path matches only one layout, so a clean install intermittently fails with "The specified electronDist does not exist". #48081 re-pointed the path at the nested layout (correct today) but electron-builder reads electronDist STATICALLY, so any future hoist change silently breaks it again — only caught by a CI invariant, never self-corrected. Fix: - scripts/run-electron-builder.cjs: resolve electron the way Node's runtime does — require.resolve("electron/package.json") walks node_modules from the desktop project upward and finds electron wherever npm actually put it. The path can never drift out of sync with the install layout again, on any OS/npm version. * dist present -> pass -c.electronDist=<abs>/dist so electron-builder reuses the unpacked runtime (keeps the #38673 fast path that dodges the 26.8.x missing-binary re-unpack bug). * dist absent -> omit electronDist; electron-builder fetches Electron itself via @electron/get honoring electronVersion + ELECTRON_MIRROR. package.json: builder script now runs the wrapper; the static build.electronDist is removed (the resolver owns it).
- main.py / install.sh / install.ps1: on a dependency-install failure where the electron package staged but its dist is missing (electron's install.js process.exit(1) on a blocked/throttled binary download — #47266/#47917/#48021), repopulate the dist via electron's downloader (canonical, then npmmirror.com) and CONTINUE to the build instead of aborting. npm runs postinstall LAST, so the only casualty is electron/dist; bailing here is what made the pack-time mirror self-heal unreachable on a blocked network. Hard-fail only when electron never staged at all (a genuine dependency error). - The pack-time mirror fallback now retries the build even when the pre-fetch can't populate the dist: the wrapper lets electron-builder download Electron itself via the mirror, so the retry is no longer a no-op (it was, when electronDist was a static path). The exact 40.10.2 pin (already on main) keeps the third mode — the native @electron-internal/extract-zip win32 binding that 40.10.3/40.10.4 ship without a published prebuild — from recurring. Tests: - test_desktop_electron_pin.py: replace the static-path-matches-lockfile invariant with contracts that there is no hardcoded electronDist to drift, the builder script routes through the resolver, and the resolver uses Node module resolution + injects -c.electronDist. - test_gui_command.py: install-failure self-heal continues to build; genuine (electron-never-staged) install failure still hard-fails; pack retries under the mirror even when the pre-fetch is blocked. Salvages/supersedes the overlapping community work in #48003 (sitkarev), #48012 (omegazheng), #48033 (james47kjv), and #48082. 
Co-authored-by: sitkarev <59806492+sitkarev@users.noreply.github.com> Co-authored-by: omegazheng <zheng@omegasys.eu> Co-authored-by: james47kjv <220877172+james47kjv@users.noreply.github.com> * fix(desktop): narrow Electron self-heal to real missing-dist failures Follow-up on #48091 to remove the remaining misdiagnosis risk from the installer/build fallback path (#46785 concern): only take the Electron repair/retry path when Electron's package files are staged and dist is actually missing/corrupt. - main.py: add _electron_pkg_staged_missing_dist() and use it to gate install failure recovery; fail fast for unrelated npm install errors. - main.py/install.sh/install.ps1: run cache purge + retry only when dist is missing; do not retry unrelated tsc/vite/build failures under an Electron-specific narrative. - install.sh/install.ps1: tighten install-stage self-heal guard to require both package.json + install.js and missing dist. - tests: add coverage that install failure hard-fails when Electron dist already exists, and update retry test to reflect the tightened recovery condition. Validation: - Python tests: 64 passed - install.sh-related tests included in the run - Real mac build on this machine: - npm ci at repo root: success - cd apps/desktop && npm run pack: success - electron-builder packaged darwin arm64 and used custom unpacked Electron dist * refactor(desktop): trim electron self-heal helpers and comments Deduplicate mirror-retry into _try_redownload_electron_dist / shell counterparts; shorten wrapper and install-script commentary without changing recovery semantics. --------- Co-authored-by: sitkarev <59806492+sitkarev@users.noreply.github.com> Co-authored-by: omegazheng <zheng@omegasys.eu> Co-authored-by: james47kjv <220877172+james47kjv@users.noreply.github.com>	2026-06-17 18:48:35 -05:00
Ben	acc8916ac7	test(gateway): live ws-transport round-trip + config-driven registration - test_ws_transport.py: drives WebSocketRelayTransport against a REAL in-process websockets server (not a mock socket): handshake (hello->descriptor), inbound frame -> handler, outbound request/response correlation, follow_up routing, and clean disconnect failing pending waiters. Skips if websockets is absent. - test_relay_registration.py: rewritten for the config-driven gate — registers when GATEWAY_RELAY_URL is set / an explicit url is passed / force=True; no-op without a URL; trailing slash stripped; adapter constructs through the registry. Full relay suite: 57 passed.	2026-06-17 16:37:45 -07:00
Ben	3db9b3e616	feat(gateway): token-less follow_up outbound op (A2 capability action) The relay outbound surface had send/edit/typing but no way to act on a SHARED-identity capability (e.g. a Discord interaction follow-up token, ~15min) that the connector captured + stripped at the edge. Under A2 that credential never reaches the gateway, so the gateway can't just 'send with the token' — it needs a semantic op naming the session it's already in. Adds the follow_up op end to end on the gateway side: - RelayTransport.send_follow_up(action): protocol method. Action carries op='follow_up' + session_key + kind + content (+ metadata) and NO token. - RelayAdapter.send_follow_up(session_key, kind, content, metadata): builds that action and returns a SendResult. The connector resolves the real capability (its resolveOutboundCapability), enforces the tenant match so tenant B can't wield tenant A's capability, and egresses; success=False when the capability is absent/expired/mismatched (nothing to retry — a leaked gateway holds zero capability material). - StubConnector records follow_ups + a canned next_follow_up_result. Tests: round-trips without a token; the wire action carries only session refs (no credential value field — the 'kind' string is a type ref, not the secret); failure surfaces when the connector can't resolve; no-transport fails cleanly. 55 passed. §4 doc entry follows in the contract-rewrite commit.	2026-06-17 16:37:45 -07:00
Ben	c28a02b49d	test(gateway): shed platform crypto from the relay path (A2 invariant) Under the A2 trust model the connector is the SOLE crypto/identity boundary: it verifies/decrypts every inbound platform payload at the edge (it holds the tenant secrets), normalizes to a tenant-scoped MessageEvent, and forwards only the sanitized event. The gateway re-validates nothing — it cannot without being handed the shared signing secret, which on a shared bot is itself the cross-tenant leak. The relay path already imports no platform-crypto today; this locks that in as an enforced invariant so nobody bolts re-validation (Discord ed25519, Twilio HMAC, WeCom BizMsgCrypt, generic webhook signature checks) onto the relay later and silently re-couples the gateway to platform secrets it must never hold. Verification stays in the direct platform adapters (gateway/platforms/) which serve non-relay deployments. - test_relay_package_imports_no_platform_crypto: AST-walks gateway/relay/ and fails on any import of a platform-crypto/verification module. - test_relay_package_calls_no_signature_verification: fails on any verification-symbol reference (ed25519/hmac/bizmsg/verify_*). Invariants (assert the relation 'relay re-validates nothing'), not frozen snapshots. Verified the guard bites: injecting a wecom_crypto import makes it fail, removing it goes green. docs §6 rewrite follows in a later commit.	2026-06-17 16:37:45 -07:00
Ben	e74577ed0f	test(gateway): Telegram relay round-trip (Phase 1 generalization proof) The Phase 1 exit gate requires BOTH Discord and Telegram to round-trip through the relay stub, but test_relay_roundtrip.py only covered Discord. Add the Telegram companion exercising its distinct discriminator profile: - no guild_id — two chats isolate on chat_id alone - forum topics share one chat_id and isolate by thread_id (the Telegram analog of Discord per-guild isolation), shared across participants by default (thread_sessions_per_user=False) - DM isolation by chat_id - utf16 len_unit + markdown_v2 dialect round-trip and configure the adapter - outbound send round-trips through the stub Proves the CapabilityDescriptor + build_session_key generalize beyond Discord, not just the struct (which the descriptor unit tests already covered).	2026-06-17 16:37:45 -07:00
Ben	5feec8b4cf	test(gateway): enforce relay contract-doc ⟷ Python conformance Add an invariant test pinning docs/relay-connector-contract.md to the Python source of truth so the doc (which the connector repo mirrors by hand) cannot silently drift: - CapabilityDescriptor §2 table ⟷ dataclass fields + required/optional - SessionSource wire keys (to_dict output) ⟷ §3 documented fields - per-platform discriminator columns exist as real SessionSource fields - guard that is_bot stays off the wire until deliberately promoted Writing the test surfaced a real gap: §3 only enumerated 5 discriminators in its per-platform table while to_dict() emits 12 keys. Seven wire keys the connector must populate (chat_name, chat_topic, user_id_alt, chat_id_alt, parent_chat_id, message_id, user_name) were undocumented — a connector author reading the doc would never know to set them. Added a complete SessionSource wire-field table to §3. The connector's existing contract.ts already carries all 12, so no connector change is needed; the doc was the lagging artifact.	2026-06-17 16:37:45 -07:00
Ben	c803661cec	fix(gateway): register relay connection checker The platform-connected-checker invariant test requires every built-in Platform enum member to have either a generic token path or a bespoke entry in _PLATFORM_CONNECTED_CHECKERS. Platform.RELAY was added without one, so test_all_builtins_have_checker_or_generic_token_path failed. Relay dials OUT to a connector and is 'connected' once an endpoint URL is configured (extra['relay_url'] or extra['url']); the capability descriptor is negotiated at handshake time, so the URL is the only config-level signal in the experimental phase. Add the checker plus a synthetic-config case exercising its True path.	2026-06-17 16:37:45 -07:00
Ben	c366466d70	test(relay): assert connector stub never leaks into production paths CI guard: fails if gateway/ or plugins/ ever imports the test-only stub connector or defines StubConnector. Matches code leaks (imports / class defs), not prose mentions, so the transport.py docstring reference to the stub's path is allowed. Phase 1 complete. Task 1.6 of the gateway-relay plan.	2026-06-17 16:37:45 -07:00
Ben	a3cdd8c39d	feat(relay): route mid-turn /stop over relay interrupt channel RelayAdapter.on_interrupt(session_key, chat_id) bridges a connector-delivered mid-turn /stop into the existing interrupt_session_activity path, setting the per-session _active_sessions Event and clearing typing — cancelling exactly the targeted session's turn without touching siblings (mirrors test_stop_thread_ sibling isolation). Transport.send_interrupt carries the gateway-side egress to the connector for socket-owner routing. Phase 1, Task 1.4 of the gateway-relay plan.	2026-06-17 16:37:45 -07:00
Ben	d0133fd8e4	feat(relay): register RelayAdapter through platform registry (flagged off by default) register_relay_adapter() registers the generic 'relay' platform via the same PlatformRegistry path as plugin adapters — no core dispatch changes. OFF by default (dark-launch): only registers when HERMES_GATEWAY_RELAY is truthy (or force=True for tests), so existing single-tenant/direct deployments are unaffected. Factory builds a transport-less RelayAdapter with a placeholder descriptor; the real descriptor is negotiated at handshake. Phase 1, Task 1.3 of the gateway-relay plan.	2026-06-17 16:37:45 -07:00
Ben	259e78e175	feat(relay): transport protocol + test-only stub connector Defines RelayTransport (lifecycle/handshake/inbound/outbound/interrupt) as the gateway<->connector wire contract; RelayAdapter.connect now registers an inbound handler that bridges connector-delivered MessageEvents into handle_message. Adds an in-memory StubConnector under tests/ and an E2E round-trip proving: connect registers the handler, inbound events reach the adapter, guild_id drives build_session_key isolation (two guilds -> two keys; same guild/channel/user -> one), outbound send round-trips, get_chat_info is proxied. Phase 1, Task 1.2 of the gateway-relay plan.	2026-06-17 16:37:45 -07:00
Ben	b0999c82f3	feat(relay): generic RelayAdapter advertising negotiated capabilities One BasePlatformAdapter subclass that reads its capability profile from a CapabilityDescriptor: MAX_MESSAGE_LENGTH attribute, message_len_fn (table-driven by len_unit: chars=len, utf16=Telegram-style code units), supports_draft_streaming. Implements the four abstract methods (connect/disconnect/send/get_chat_info) by delegating to an injected RelayTransport (full protocol lands in Task 1.2). Adds Platform.RELAY enum member. No per-platform gateway code. Phase 1, Task 1.1 of the gateway-relay plan.	2026-06-17 16:37:45 -07:00
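The table-driven `message_len_fn` mentioned in the RelayAdapter commit above (b0999c82f3) can be illustrated with a short sketch. This is only an illustration under stated assumptions: the `len_unit` values `chars` and `utf16` come from the commit message, while the names `_LEN_FNS` and `message_len_fn_for` are hypothetical and are not taken from the repository.

```python
from typing import Callable, Dict


def _len_chars(text: str) -> int:
    # "chars": plain Python length, i.e. Unicode code points.
    return len(text)


def _len_utf16(text: str) -> int:
    # "utf16": UTF-16 code units. Characters outside the Basic
    # Multilingual Plane (many emoji, for example) count as 2.
    # "surrogatepass" keeps lone surrogates from raising on encode.
    return len(text.encode("utf-16-le", "surrogatepass")) // 2


# Hypothetical lookup table keyed by the descriptor's len_unit.
_LEN_FNS: Dict[str, Callable[[str], int]] = {
    "chars": _len_chars,
    "utf16": _len_utf16,
}


def message_len_fn_for(len_unit: str) -> Callable[[str], int]:
    """Return the length function for a len_unit (hypothetical helper)."""
    try:
        return _LEN_FNS[len_unit]
    except KeyError:
        raise ValueError(f"unsupported len_unit: {len_unit!r}") from None
```

Under this sketch, a string containing one emoji outside the Basic Multilingual Plane measures one unit longer under `utf16` than under `chars`, which is why a single length limit cannot be checked with `len()` alone on every platform.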


5673 commits