hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-19 10:02:16 +00:00

Author	SHA1	Message	Date
Ben Barclay	c661634537	fix(dashboard): stream file uploads via multipart instead of base64 JSON (NS-501) (#47663 ) * fix(dashboard): stream file uploads via multipart instead of base64 JSON The dashboard file manager uploaded files (including backup/restore zip archives) by reading them client-side with FileReader.readAsDataURL and POSTing a base64 data URL inside a JSON body to /api/files/upload. For a large backup this (a) inflates the payload ~33%, (b) buffers the whole file plus its decoded copy in memory, and (c) reliably trips an upstream proxy body-size/timeout limit, surfacing as a 502 with the upload appearing to hang indefinitely (NS-501). Dashboard-only hosted users have no shell fallback to place the archive, so backup restore was unusable. Add a streaming multipart endpoint POST /api/files/upload-stream (UploadFile + Form) that reads the request body in 1 MiB chunks straight to a sibling temp file, enforces the existing 100 MB size cap as it streams (413 on overflow, before buffering the whole file), and atomically renames into place so a partial/aborted/over-limit upload never clobbers an existing file. The frontend api.uploadFile now sends multipart/form-data (raw bytes, no base64, browser-set boundary) and FilesPage passes the File object directly; the dead readAsDataUrl helper is removed. The legacy base64 JSON endpoint stays for backward compat. FastAPI's UploadFile/Form require python-multipart, which is NOT pulled in by fastapi itself, so it is added to the base deps, the [web] extra, and the tool.dashboard lazy-install set (kept in sync). 
Validated: 5 new endpoint tests (roundtrip, multi-chunk >1 MiB, over-limit 413 without clobbering + no temp-file leak, overwrite=false conflict, forced-root traversal containment); existing base64 tests still pass; web typecheck + vite build clean; and a real uvicorn server E2E (5 MB multipart upload -> HTTP 200 in 0.21s, exact byte match) plus a 30 MB TestClient roundtrip confirm constant-memory streaming end to end. Reported via beta (NS-501). * build(deps): regenerate uv.lock for python-multipart (NS-501) CI ran uv lock --check / uv sync --locked which failed because the python-multipart dependency add was not reflected in uv.lock. Regenerate the lockfile (resolves to 0.0.20, matching the [web] extra pin) after merging current main.	2026-06-18 15:54:32 +10:00
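The streaming behaviour described in this commit (1 MiB chunks, a size cap enforced mid-stream, a sibling temp file, an atomic rename) can be sketched with the standard library alone. This is a minimal illustration, not the project's endpoint: `stream_to_file`, `UploadTooLarge`, and the exact byte value of the cap are hypothetical names and assumptions, and a real handler would map the error to a 413 response.

```python
import contextlib
import io
import os
import tempfile
from pathlib import Path
from typing import BinaryIO

CHUNK_SIZE = 1024 * 1024               # 1 MiB, as stated in the commit message
MAX_UPLOAD_BYTES = 100 * 1024 * 1024   # "100 MB" cap; the exact byte value is an assumption


class UploadTooLarge(Exception):
    """Hypothetical error; an HTTP layer would translate it to a 413."""


def stream_to_file(src: BinaryIO, dest: Path, max_bytes: int = MAX_UPLOAD_BYTES) -> int:
    """Copy src to dest in chunks without ever holding the whole file in memory."""
    # The temp file lives next to dest so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=dest.parent, prefix=dest.name + ".", suffix=".part")
    written = 0
    try:
        with os.fdopen(fd, "wb") as tmp:
            while True:
                chunk = src.read(CHUNK_SIZE)
                if not chunk:
                    break
                written += len(chunk)
                if written > max_bytes:
                    # Abort before the oversized chunk is written.
                    raise UploadTooLarge(f"upload exceeds {max_bytes} bytes")
                tmp.write(chunk)
        # Atomic swap: dest is either the old file or the complete new one.
        os.replace(tmp_name, dest)
    except BaseException:
        # Never leave a partial temp file behind.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise
    return written
```

Because the rename only happens after the last chunk is written, an aborted or over-limit upload leaves any existing file at `dest` untouched.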
Ben Barclay	9c3c5da356	fix(backup): hermes import never overwrites volatile gateway runtime state (NS-501) (#48243 ) Importing a backup wrote every file from the zip over the target home wholesale. On a hosted instance this clobbered gateway_state.json with the source machine's last recorded run/desired state — driving the container-boot reconciler (container_boot._read_desired_state, which only auto-starts a gateway whose state is "running") off stale/foreign state and leaving the gateway stuck "starting", disconnected from the Nous portal. Add _IMPORT_SKIP_NAMES (gateway_state.json, gateway.pid, cron.pid, gateway.lock, processes.json) and skip them by basename in run_import, so both the root profile and named profiles preserve the target's own runtime state. This mirrors what container_boot._STALE_RUNTIME_FILES already sweeps on every container boot, and protects against older backups that predate the backup-side exclusions. The import summary reports which files were preserved. This is the second half of NS-501 (filed separately as NS-508): the upload 502 was fixed in #47663; this fixes the import-breaks-the-instance half.	2026-06-18 15:27:45 +10:00
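Skipping runtime files by basename, as this commit describes, might look roughly like the sketch below. The file names come from the commit message; `partition_members` is a hypothetical helper and says nothing about how `run_import` is actually structured.

```python
from pathlib import PurePosixPath
from typing import Iterable, List, Tuple

# Names listed in the commit message for _IMPORT_SKIP_NAMES.
IMPORT_SKIP_NAMES = frozenset(
    {"gateway_state.json", "gateway.pid", "cron.pid", "gateway.lock", "processes.json"}
)


def partition_members(names: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split archive member names into (to_restore, preserved_on_target)."""
    restore: List[str] = []
    preserved: List[str] = []
    for name in names:
        # Matching on the basename covers the root profile and named profiles alike.
        if PurePosixPath(name).name in IMPORT_SKIP_NAMES:
            preserved.append(name)
        else:
            restore.append(name)
    return restore, preserved
```

The second list is what an import summary could report as preserved.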
Ben Barclay	4440d77bf3	fix(update): scope install-method stamp to the code tree, not $HERMES_HOME (#48188 ) The install method (docker/git/pip/...) describes the running binary, but detect_install_method() read it from $HERMES_HOME/.install_method — a shared DATA directory. The Docker docs deliberately bind-mount $HERMES_HOME (~/.hermes:/opt/data) so config/sessions/memory persist and can be shared with a host-side Desktop/CLI install. When a containerized gateway and a host install share one $HERMES_HOME, the home-scoped stamp is a single slot describing two installs: the published image stamps 'docker' on every boot, the host install then reads 'docker' and the in-app updater refuses to run 'hermes update' ("doesn't apply inside the Docker container"). Reinstalling the Desktop app from the DMG doesn't help because the contaminated stamp is re-read every time. Fix (option 1 — code-scoped stamp): - detect_install_method() reads <install tree>/.install_method first (next to the running code, immune to the shared data dir). It falls back to the legacy $HERMES_HOME stamp for back-compat, but IGNORES a 'docker' home stamp when not actually containerized — so already-poisoned shared homes self-heal. - stamp_install_method() writes the code-scoped stamp. - install.sh stamps $INSTALL_DIR instead of $HERMES_HOME. - Dockerfile bakes 'docker' into /opt/hermes/.install_method at build time (inside the immutable block); stage2-hook.sh no longer writes the home stamp and proactively removes a stale 'docker' one to heal existing shared homes. Genuine containers still resolve to 'docker' (baked stamp, or legacy home stamp honored when containerized). Unstamped installs in generic containers still fall through to git/pip (preserves the #34397 fix).	2026-06-18 14:14:41 +10:00
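The lookup order described in this commit (code-scoped stamp first, legacy home stamp second, a 'docker' home stamp ignored outside a container) can be sketched as follows. The function signature, the `_read_stamp` helper, and the `containerized` flag are assumptions made for illustration; the real `detect_install_method()` may differ.

```python
from pathlib import Path
from typing import Optional

STAMP_NAME = ".install_method"


def _read_stamp(directory: Path) -> Optional[str]:
    """Return the stamp text in directory, or None if it is absent or empty."""
    try:
        value = (directory / STAMP_NAME).read_text(encoding="utf-8").strip()
    except OSError:
        return None
    return value or None


def detect_install_method(install_tree: Path, hermes_home: Path, containerized: bool) -> Optional[str]:
    # 1. The stamp next to the running code wins; it cannot be shared between installs.
    code_stamp = _read_stamp(install_tree)
    if code_stamp:
        return code_stamp
    # 2. Legacy stamp in the shared data directory, kept for back-compat.
    home_stamp = _read_stamp(hermes_home)
    if home_stamp == "docker" and not containerized:
        # A host install reading a container's stamp: treat it as unstamped
        # so the caller can fall through to its git/pip detection.
        return None
    return home_stamp
```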
Ben Barclay	c276b017ad	feat(relay): connector⇄gateway channel auth + signed-HTTP inbound receiver + enroll CLI (#48147 ) * feat(relay): authenticate the connector⇄gateway WS channel The relay gateway may be customer-managed and internet-exposed, so the connector⇄gateway channel is itself authenticated (distinct from the platform crypto the relay path sheds). Add gateway/relay/auth.py — a Python port of the connector's HMAC token + delivery-signature schemes (relayAuthToken.ts / deliverySigning.ts), verified byte-for-byte against the connector's compiled TypeScript via cross-language test vectors. Present an Authorization bearer on the /relay WS upgrade keyed by the per-gateway secret (resolved from GATEWAY_RELAY_ID / GATEWAY_RELAY_SECRET in env or config). The connector rejects an unauthenticated/invalid/ revoked upgrade with close 4401. * feat(relay): signed-HTTP inbound delivery receiver The connector delivers normalized inbound events to a tenant's gateway over a signed HTTP POST, not the outbound /relay WS: the connector instance owning a platform socket is generally not the instance a given gateway dialed out to, so inbound targets a tenant endpoint that may load-balance across gateway instances. Add gateway/relay/inbound_receiver.py — verifies x-relay-signature / x-relay-timestamp over the EXACT raw request bytes (re-serializing would break the HMAC: JS JSON.stringify is compact, Python json.dumps spaces) against the per-tenant delivery key verify list within a 300s replay window, then dispatches messages to handle_message and interrupts to the interrupt handler. Wire it into the adapter lifecycle (start in connect() when a delivery key + bind port are configured, tear down in disconnect(); a purely-outbound dev gateway runs without it). 
Refine test_relay_sheds_crypto to distinguish PLATFORM crypto (Discord ed25519, Twilio/WeCom HMAC — still shed) from the connector⇄gateway CHANNEL auth (intended): auth.py / inbound_receiver.py are exempt from the platform-symbol scan but still banned from importing platform-crypto modules, plus a positive guard that auth.py uses only stdlib hmac/hashlib. * feat(relay): hermes gateway enroll CLI Add the gateway half of zero-touch enrollment. `hermes gateway enroll` resolves a fresh Nous Portal access token (the tenant-proving identity), POSTs {enrollmentToken, gatewayId} to the connector's /relay/enroll, and persists GATEWAY_RELAY_ID / GATEWAY_RELAY_SECRET / GATEWAY_RELAY_DELIVERY_KEY to ~/.hermes/.env. The per-gateway secret authenticates the WS upgrade; the per-tenant delivery key verifies signed inbound deliveries. Refuses under is_managed() (hosted installs get the secret stamped in by the orchestrator). Added as an 'enroll' subcommand on the existing gateway subparser — not a new top-level command. * docs(relay): inbound is signed HTTP, not WS; document channel auth Fix the stale contract: §3/§5 said inbound rode the WS socket (single- instance only, predates the multi-instance socket-ownership + channel-auth model). Inbound + connector→gateway interrupt are signed HTTP POSTs to the tenant endpoint. Add §6.1 documenting the two channel-auth schemes (per- gateway WS-upgrade secret, per-tenant inbound delivery key) and how they differ from the platform crypto the relay path sheds. * test(relay): update build_gateway_parser callers for cmd_gateway_enroll The enroll subcommand added cmd_gateway_enroll as a required keyword-only arg to build_gateway_parser, but two existing parser-extraction tests still called it with only cmd_gateway/cmd_proxy — failing CI with TypeError. Thread the new handler through both call sites and add a test asserting `gateway enroll` dispatches to cmd_gateway_enroll with its flags parsed.	2026-06-18 12:01:54 +10:00
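The point about verifying over the exact raw request bytes is easiest to see in code. The sketch below assumes HMAC-SHA256 over `<timestamp>.` followed by the raw body, hex-encoded, with a Unix-seconds timestamp; that layout is an assumption for illustration and is not taken from the connector's `deliverySigning.ts`. The 300 s replay window and the key verify list come from the commit message.

```python
import hashlib
import hmac
import json
import time
from typing import Iterable, Optional

REPLAY_WINDOW_SECONDS = 300  # replay window stated in the commit message


def sign(key: bytes, timestamp: str, raw_body: bytes) -> str:
    # Assumed layout: HMAC-SHA256 over b"<timestamp>." + raw body, hex digest.
    return hmac.new(key, timestamp.encode() + b"." + raw_body, hashlib.sha256).hexdigest()


def verify_delivery(
    raw_body: bytes,
    signature: str,
    timestamp: str,
    keys: Iterable[bytes],
    now: Optional[float] = None,
) -> bool:
    """Check the timestamp window, then try every key in the verify list."""
    now = time.time() if now is None else now
    try:
        sent_at = int(timestamp)  # assumed to be Unix seconds
    except ValueError:
        return False
    if abs(now - sent_at) > REPLAY_WINDOW_SECONDS:
        return False
    presented = signature.encode()
    # compare_digest keeps the comparison constant-time per key.
    return any(
        hmac.compare_digest(sign(key, timestamp, raw_body).encode(), presented)
        for key in keys
    )
```

Parsing the body and calling `json.dumps` on the result turns `{"a":1}` into `{"a": 1}`, so the digest no longer matches; this is why the receiver has to hash the bytes exactly as they arrived.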
Ben Barclay	fcf6cb3d73	fix(docker): supervised gateway uses --replace to take over stale holder (NS-505) (#47555 ) * fix(docker): supervised gateway uses --replace to take over stale holder Inside the s6 container image the per-profile gateway service rendered a bare `hermes gateway run` (no --replace). When a gateway is started OUTSIDE s6 — a stray shell `hermes gateway run`, an agent action, or the Open WebUI helper (scripts/setup_open_webui.sh) — it grabs the per-HERMES_HOME PID lock first. The supervised slot then execs the bare `gateway run`, hits the "Another gateway instance is already running" guard, exits non-zero, and s6 restarts it: a restart loop that floods the log every ~12s and never binds. The container looks up but the gateway is permanently down, and dashboard-only users (no shell) cannot recover. Render the supervised run script as `gateway run --replace` so s6 is authoritative for its slot: it reaps the stale holder via the hardened takeover path (takeover marker + SIGTERM->SIGKILL-with-confirmation + scoped-lock cleanup in gateway/run.py) and binds. This matches the systemd service path, which already builds its argv with --replace (_build_gateway_argv / 'nohup hermes gateway run --replace'), and the intent already documented in _maybe_redirect_run_to_s6_supervision. The existing HERMES_S6_SUPERVISED_CHILD sentinel still prevents the run->start->run redirect recursion. Each profile is scoped to its own HERMES_HOME and s6 guarantees one supervised instance per slot, so there is no legitimate supervised sibling for --replace to clobber. Reported via beta (NS-505): gateway.log showed PID 17907 'running (manual process)' with the guard error repeating every ~12s on v2026.6.5. Adds a regression test asserting every gateway-run exec line in the rendered script (default + named profile, both privilege branches) carries --replace, and updates the existing render-script assertion. 
* fix(ci): remove stray .venv symlink committed into repo The PR's commit accidentally tracked a .venv symlink pointing at the developer's local venv (mode 120000 -> /home/ben/nous/hermes-agent/.venv). The CI test/e2e/build jobs run `uv venv` to create .venv and failed with `failed to create directory .venv: File exists (os error 17)` because the checkout already contained the symlink. All test shards aborted in <15s during setup, before any test ran. Untrack the symlink and add a bare `.venv` entry to .gitignore (the existing `.venv/` rule only matches a directory, so a symlink slipped through).	2026-06-18 10:49:02 +10:00
Teknium	9ba4615db2	fix(dump): show commit date instead of release date in hermes debug (#48104) * feat(mcp): raise default tool-call timeout 120s -> 300s Port from openai/codex#28234. Long-running MCP tools (web fetches, sandboxed builds, deep-research servers) routinely exceed 120s, causing spurious timeout failures. Codex bumped its default MCP tool timeout from 120 to 300 for the same reason. - _DEFAULT_TOOL_TIMEOUT 120 -> 300 in tools/mcp_tool.py (per-server 'timeout' config override unchanged) - update test_default_timeout assertion - document the default in mcp-config-reference.md * fix(dump): show commit date instead of release date in hermes dump The version line in `hermes dump` (the top of the /debug report) appended the package release date in parentheses, which reads like a wall-clock "generated at" timestamp and confuses support triage. Replace it with the date the HEAD commit was actually made, resolved live via `git log -1 --format=%cd --date=short`, kept next to the commit SHA. On Docker/wheel installs with no .git the date resolves to '' and the suffix is simply omitted (the baked SHA still identifies the build).	2026-06-17 16:53:42 -07:00
brooklyn!	c1f9eb0ec4	fix(desktop): resolve electronDist dynamically + self-heal blocked installs (supersedes #48081/#48082) (#48091 ) * fix(desktop): resolve electronDist dynamically + self-heal blocked installs Supersedes the static-path approach (#48081) and the install-step self-heal (#48082) with a fix that removes the whole failure class instead of chasing each symptom. Three distinct faults converged into the June desktop-build outage; this closes all three. Root cause (the part #48081 left open — "Gap B"): build.electronDist was a static relative path in apps/desktop/package.json, but npm workspace hoisting is NOT deterministic — depending on the npm version and what else is installed, npm nests the workspace-only electron devDep under apps/desktop/node_modules/electron OR hoists it to the repo root. A static path matches only one layout, so a clean install intermittently fails with "The specified electronDist does not exist". #48081 re-pointed the path at the nested layout (correct today) but electron-builder reads electronDist STATICALLY, so any future hoist change silently breaks it again — only caught by a CI invariant, never self-corrected. Fix: - scripts/run-electron-builder.cjs: resolve electron the way Node's runtime does — require.resolve("electron/package.json") walks node_modules from the desktop project upward and finds electron wherever npm actually put it. The path can never drift out of sync with the install layout again, on any OS/npm version. * dist present -> pass -c.electronDist=<abs>/dist so electron-builder reuses the unpacked runtime (keeps the #38673 fast path that dodges the 26.8.x missing-binary re-unpack bug). * dist absent -> omit electronDist; electron-builder fetches Electron itself via @electron/get honoring electronVersion + ELECTRON_MIRROR. package.json: builder script now runs the wrapper; the static build.electronDist is removed (the resolver owns it). 
- main.py / install.sh / install.ps1: on a dependency-install failure where the electron package staged but its dist is missing (electron's install.js process.exit(1) on a blocked/throttled binary download — #47266/#47917/#48021), repopulate the dist via electron's downloader (canonical, then npmmirror.com) and CONTINUE to the build instead of aborting. npm runs postinstall LAST, so the only casualty is electron/dist; bailing here is what made the pack-time mirror self-heal unreachable on a blocked network. Hard-fail only when electron never staged at all (a genuine dependency error). - The pack-time mirror fallback now retries the build even when the pre-fetch can't populate the dist: the wrapper lets electron-builder download Electron itself via the mirror, so the retry is no longer a no-op (it was, when electronDist was a static path). The exact 40.10.2 pin (already on main) keeps the third mode — the native @electron-internal/extract-zip win32 binding that 40.10.3/40.10.4 ship without a published prebuild — from recurring. Tests: - test_desktop_electron_pin.py: replace the static-path-matches-lockfile invariant with contracts that there is no hardcoded electronDist to drift, the builder script routes through the resolver, and the resolver uses Node module resolution + injects -c.electronDist. - test_gui_command.py: install-failure self-heal continues to build; genuine (electron-never-staged) install failure still hard-fails; pack retries under the mirror even when the pre-fetch is blocked. Salvages/supersedes the overlapping community work in #48003 (sitkarev), #48012 (omegazheng), #48033 (james47kjv), and #48082. 
Co-authored-by: sitkarev <59806492+sitkarev@users.noreply.github.com> Co-authored-by: omegazheng <zheng@omegasys.eu> Co-authored-by: james47kjv <220877172+james47kjv@users.noreply.github.com> * fix(desktop): narrow Electron self-heal to real missing-dist failures Follow-up on #48091 to remove the remaining misdiagnosis risk from the installer/build fallback path (#46785 concern): only take the Electron repair/retry path when Electron's package files are staged and dist is actually missing/corrupt. - main.py: add _electron_pkg_staged_missing_dist() and use it to gate install failure recovery; fail fast for unrelated npm install errors. - main.py/install.sh/install.ps1: run cache purge + retry only when dist is missing; do not retry unrelated tsc/vite/build failures under an Electron-specific narrative. - install.sh/install.ps1: tighten install-stage self-heal guard to require both package.json + install.js and missing dist. - tests: add coverage that install failure hard-fails when Electron dist already exists, and update retry test to reflect the tightened recovery condition. Validation: - Python tests: 64 passed - install.sh-related tests included in the run - Real mac build on this machine: - npm ci at repo root: success - cd apps/desktop && npm run pack: success - electron-builder packaged darwin arm64 and used custom unpacked Electron dist * refactor(desktop): trim electron self-heal helpers and comments Deduplicate mirror-retry into _try_redownload_electron_dist / shell counterparts; shorten wrapper and install-script commentary without changing recovery semantics. --------- Co-authored-by: sitkarev <59806492+sitkarev@users.noreply.github.com> Co-authored-by: omegazheng <zheng@omegasys.eu> Co-authored-by: james47kjv <220877172+james47kjv@users.noreply.github.com>	2026-06-17 18:48:35 -05:00
Teknium	f8098c6b6f	fix(desktop): resolve electronDist to the actual electron install location (#48081 ) After the June lockfile regeneration (#46652) floated electron and reshuffled npm workspace hoisting, the desktop pack fails with "The specified electronDist does not exist". apps/desktop/package.json pointed electronDist at the repo root (../../node_modules/electron/dist) while npm now installs electron nested under apps/desktop/node_modules/electron. The two contradict, so a clean install can never package the app (Windows + macOS). - electronDist -> node_modules/electron/dist (resolved relative to apps/desktop, i.e. the workspace-local install npm actually produces). - hermes_cli/main.py, scripts/install.sh, scripts/install.ps1: add a runtime electron-dir resolver that prefers apps/desktop/node_modules/electron and falls back to the root hoist, so dist checks + the mirror re-download work under either npm layout. - patch-electron-builder-mac-binary.cjs: try the workspace-local Electron.app before the root hoist in the macOS binary-restore fallback (sibling site no PR touched). - test: assert build.electronDist resolves to where the lockfile installs electron, so a future hoist change (root <-> nested) can't silently break it. Salvages the overlapping work in #48003 (sitkarev), #48012 (omegazheng), and #48033 (james47kjv). Co-authored-by: sitkarev <59806492+sitkarev@users.noreply.github.com> Co-authored-by: omegazheng <zheng@omegasys.eu> Co-authored-by: james47kjv <220877172+james47kjv@users.noreply.github.com>	2026-06-17 18:08:01 -05:00
kshitij	49d7481dfb	Merge pull request #47706 from NousResearch/fix/cli-login-deprecation-graceful fix(cli): deprecated `hermes login` fails gracefully for any provider	2026-06-17 23:02:32 +05:30
definitelynotguru	eaddeaf2e6	feat(xai): add grok-composer-2.5-fast to xAI OAuth model picker The model is callable via xAI OAuth but omitted from models.dev and /v1/models listings. Merge it into the curated xAI catalog so it appears in `hermes model` without requiring a custom model name.	2026-06-17 09:49:46 -07:00
Teknium	c6c8abbadb	refactor: remove agent-callable send_message tool (#47856 ) * feat(mcp): raise default tool-call timeout 120s -> 300s Port from openai/codex#28234. Long-running MCP tools (web fetches, sandboxed builds, deep-research servers) routinely exceed 120s, causing spurious timeout failures. Codex bumped its default MCP tool timeout from 120 to 300 for the same reason. - _DEFAULT_TOOL_TIMEOUT 120 -> 300 in tools/mcp_tool.py (per-server 'timeout' config override unchanged) - update test_default_timeout assertion - document the default in mcp-config-reference.md * refactor: remove agent-callable send_message tool The agent should not decide on its own to fire off cross-platform messages or reactions. Outbound platform messaging is handled outside the agent loop — cron delivery, the gateway kanban notifier (dashboard-toggled), and the `hermes send` CLI. Removes the model-tool registration only; the send engine in send_message_tool.py (_send_to_platform, _send_via_adapter, _parse_target_ref, per-platform _send_* helpers) is kept intact for those non-agent callers. Drops the now-empty 'messaging' toolset and its `hermes tools` toggle. Yuanbao DM guidance now points at the native yb_send_dm tool.	2026-06-17 07:11:23 -07:00
Teknium	cbfa018aef	fix(auth): retry Codex device-code login on 429 with clear rate-limit message (#47860 ) The OpenAI device-code login (POST auth.openai.com/.../deviceauth/usercode) had no retry or 429 handling — a transient throttle from OpenAI surfaced as a bare "Device code request returned status 429" with no guidance, reading as a hard login failure. - Retry the device-code request with capped exponential backoff (honoring Retry-After), up to 4 attempts. - On persistent 429, raise a clear AuthError tagged CODEX_RATE_LIMITED_CODE (classified transient, not a credential problem) with a wait hint. - Apply the same 429 classification to the token-exchange step (same bug class). Unrelated to PR #47399 (Responses-API cache headers); this is the OAuth device-code path in hermes_cli/auth.py.	2026-06-17 05:48:35 -07:00
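A capped exponential backoff that honors `Retry-After`, with up to 4 attempts as described in this commit, could be sketched as below. The transport is injected so the example stays self-contained; `request_with_retry`, `RateLimited`, and the base and cap values are hypothetical and do not mirror `hermes_cli/auth.py`. Only the delta-seconds form of `Retry-After` is handled here.

```python
import time
from typing import Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, headers)


class RateLimited(Exception):
    """Hypothetical stand-in for an error tagged as transient rate limiting."""


def backoff_delay(attempt: int, retry_after: Optional[str], base: float = 1.0, cap: float = 30.0) -> float:
    """Prefer the server's Retry-After (seconds); otherwise back off exponentially."""
    if retry_after is not None:
        try:
            return max(0.0, min(float(retry_after), cap))
        except ValueError:
            pass  # HTTP-date form is not handled in this sketch
    return min(base * 2 ** attempt, cap)


def request_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status, headers
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt, headers.get("Retry-After")))
    raise RateLimited(f"still rate limited after {max_attempts} attempts; wait and retry")
```

Raising a distinct error type after the last attempt lets the caller present the failure as a throttle rather than a credential problem.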
teknium1	06d907dc4e	fix(dashboard): only run runtime-pid liveness fallback against local status get_runtime_status_running_pid() validates liveness with a local os.kill(pid, 0) probe. In /api/status the runtime record can be the REMOTE health-probe body (cross-container), whose PID belongs to another host and is display-only — probing it locally is wrong and trips the test live-system guard (os.kill on a PID outside the test subtree). Run the fallback only against the local read_runtime_status() record.	2026-06-17 05:40:57 -07:00
teknium1	dc86d48a3e	fix(dashboard): use await-safe config-only scope for /api/status profile _profile_scope swaps process-global skills_tool/skill_manager module attrs under an RLock; /api/status holds that scope across the run_in_executor remote-health probe await, so a concurrent /api/skills?profile=X request can cross-restore the status profile's skill dir on its finally. Add _config_profile_scope (contextvar-only, task-local, await-safe) and use it for status, which only resolves get_hermes_home() at call time for config/env/gateway state and never needs the skills-module globals.	2026-06-17 05:40:57 -07:00
Shannon Sands	674e8b098a	Fix dashboard gateway profile scoping	2026-06-17 05:40:57 -07:00
Teknium	f80381c456	feat(prompt): scale context-file cap to model window + point agent at truncated file (#47846 ) Context files (AGENTS.md, CLAUDE.md, .hermes.md, .cursorrules, SOUL.md) were hard-capped at a flat 20K chars before head/tail truncation. Among the agent harnesses we track, only Codex caps project docs at all (32 KiB); Claude Code, OpenCode, and Cline load them whole. The flat 20K predates large context windows and silently truncates real-world AGENTS.md files. B — dynamic cap: when context_file_max_chars is unset (now the shipped default), the cap scales with the model's context window (ctx_tokens * 4 * 0.06, floor 20K, ceiling 500K). Small-context models stay at the historical 20K; a 200K model gets 48K; large models stop truncating real docs. An explicit context_file_max_chars still wins. Context length is resolved once per conversation (stable -> prompt cache untouched). C — when truncation does happen, the marker now names the concrete file path and tells the agent to read_file it for the full content. Validation: 154 targeted tests + full agent/ + hermes_cli/ + test_config (0 failures); E2E against a real 60K AGENTS.md confirms small windows truncate with the path-bearing marker, large windows load whole, and the system prompt is byte-stable across rebuilds.	2026-06-17 05:40:26 -07:00
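The dynamic cap formula in this commit (ctx_tokens * 4 * 0.06, floor 20K, ceiling 500K) works out as in the sketch below. `context_file_cap` is a hypothetical name, and the 6% factor is written as integer arithmetic to avoid float rounding; the constants are the ones quoted in the commit message.

```python
from typing import Optional

CHARS_PER_TOKEN = 4        # rough chars-per-token factor from the commit message
BUDGET_PERCENT = 6         # the 0.06 share, kept as an integer percentage
FLOOR_CHARS = 20_000
CEILING_CHARS = 500_000


def context_file_cap(ctx_tokens: int, explicit_max_chars: Optional[int] = None) -> int:
    """Return the context-file character cap for a model's context window."""
    if explicit_max_chars is not None:
        return explicit_max_chars  # an explicit context_file_max_chars still wins
    scaled = ctx_tokens * CHARS_PER_TOKEN * BUDGET_PERCENT // 100
    return max(FLOOR_CHARS, min(scaled, CEILING_CHARS))
```

For a 200K-token window this gives 200,000 * 4 * 6 / 100 = 48,000 characters, matching the figure in the commit message; a 32K-token window scales to 7,680 and is lifted to the 20,000 floor.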
Teknium	7bbffceb9c	feat(curator): make skill consolidation opt-in (prune stays default-on) (#47840) The curator now defaults to prune-only: the deterministic inactivity pass (mark stale / archive long-unused skills) still runs whenever the curator is enabled, but the opinionated LLM umbrella-building consolidation fork is OFF by default. - agent/curator.py: add DEFAULT_CONSOLIDATE=False + get_consolidate(); gate the forked aux-model review in run_curator_review behind it (new consolidate param, None=read config). When off, the LLM pass is skipped entirely (no aux-model cost); the run is still recorded and reported. - config.py: add curator.consolidate (default false); v29->v30 migration seeds the key for existing installs without clobbering a user-set value. - hermes_cli/curator.py: 'hermes curator run --consolidate' override; status shows consolidate state; prune-only notice on run. - docs + tests.	2026-06-17 05:20:32 -07:00
Teknium	e48803daec	fix(gateway): defer macOS launchd reload when run inside the gateway tree (#47842 ) When refresh_launchd_plist_if_needed() runs from inside the gateway's own launchd process tree (agent-initiated self-update via the terminal tool), a direct launchctl bootout tears down the service's process group — including the CLI doing the refresh — before the follow-up bootstrap can run. The gateway is left unloaded and KeepAlive can't revive it (#43842). Detect in-service execution via gateway.status.get_running_pid() + _is_pid_ancestor_of_current_process(), and delegate the bootout->bootstrap to a detached (start_new_session=True) helper that survives the process-group teardown. The normal out-of-tree CLI path is unchanged. Fixes #43842.	2026-06-17 05:19:21 -07:00
kshitijk4poor	a7ec334448	fix(cli): deprecated `hermes login` fails gracefully for any provider `hermes login` was removed in favor of `hermes auth` / `hermes model`, but the subparser still validated `--provider` against a hardcoded choices list (nous, openai-codex, xai-oauth). Running `hermes login --provider anthropic` therefore crashed in argparse with `invalid choice: 'anthropic'` before the deprecation handler could print the redirect to `hermes model` — so a user trying to authenticate a perfectly valid provider just saw a hard error and assumed the feature was broken rather than relocated. - Drop the restrictive `choices=` so every `--provider` value reaches the deprecation handler (which ignores the value and prints guidance). - Omit the subparser `help=` kwarg so the dead command no longer advertises itself in `hermes --help` (#24756). Avoids the `==SUPPRESS==` placeholder leak that `help=argparse.SUPPRESS` emits for a top-level subparser on 3.12+. - `hermes login [--flags]` still reaches the actionable deprecation message for old scripts/aliases; `hermes login --help` shows the redirect. Picks up the intent of the inactivity-closed #24902, rebased onto the post-refactor parser location (hermes_cli/subcommands/login.py) and extended to fix the whole bug class (any provider value), not just hiding from --help. Tests: parametrized provider acceptance + help-suppression (no SUPPRESS leak).	2026-06-17 12:55:40 +05:30
kshitijk4poor	ca6542f602	docs(cli): note URL exclusion in _extract_path_word docstring The docstring described a token as path-like when it contains a "/" separator, but the keystroke-latency fix now excludes "://" scheme tokens (URLs) even though they contain "/". Document the exclusion so the contract matches the behavior.	2026-06-17 12:36:01 +05:30
xxxigm	f48b312037	fix(cli): keep typing responsive by not blocking the keystroke loop The interactive CLI input box runs its completer with `complete_while_typing=True`, so `SlashCommandCompleter.get_completions` is invoked on every keystroke. That completer does blocking I/O: fuzzy `@`-file indexing shells out to `rg`/`fd` (up to a 2s timeout) and file-path completion calls `os.listdir` + `stat`. Because the completer was passed inline (never wrapped in `ThreadedCompleter`), all of this ran synchronously on the prompt_toolkit event loop, stalling the render after each key — very noticeable on WSL2 and other slow-filesystem setups ("typing in the prompt box being very latent"). Two fixes: - Wrap the input completer in `ThreadedCompleter` so completion work runs off the UI event loop and never blocks rendering between keystrokes. - Stop treating URLs as file paths in `_extract_path_word`: a token like `https://example.com/x` contains `/`, so it triggered `os.listdir` on every keystroke while typing/pasting a link (listing a bogus `https:` dir) for a completion that can never be useful. Skip any token with a `://` scheme separator. (cherry picked from commit `b5be2ba276`)	2026-06-17 12:32:38 +05:30
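The URL exclusion described in this commit reduces to a small predicate. The sketch below is a simplified stand-in, assuming a token counts as path-like only when it contains `/`; the real `_extract_path_word` may apply further rules.

```python
def looks_like_path(token: str) -> bool:
    """Return True when a token should trigger file-path completion."""
    if "://" in token:
        # Scheme separator: a URL, so listing a directory for it is never useful.
        return False
    return "/" in token
```

Checking the scheme separator first means a pasted link never reaches the filesystem calls.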
teknium	36ae958473	feat(gateway): gate message timestamps behind opt-in (default off) Follow-up to salvaged PR #41633: the timestamp prefix injection was unconditional. Gate the in-context render behind gateway.message_timestamps.enabled (default false) at both the live-message and history-replay sites; timestamp metadata is still captured + persisted regardless so the toggle can be flipped on later. Add DEFAULT_CONFIG entry, docs, and gate tests.	2026-06-16 15:49:59 -07:00
xxxigm	d1ecebcbfd	fix(desktop): re-download Electron binary via mirror when pack fails (#47266 ) (#47276 ) * fix(desktop): re-download Electron binary via mirror when pack fails (#47266) Since #38673 pinned build.electronDist to node_modules/electron/dist, electron-builder reads the Electron binary straight from there and never downloads it during `npm run pack`. That dist tree is only produced by the electron package's postinstall (install.js) during `npm ci`. When that download is blocked or throttled (GitHub's release host is unreachable in some regions), the dist is missing and the build dies with: The specified electronDist does not exist: .../node_modules/electron/dist The existing ELECTRON_MIRROR fallback in all three desktop-build paths (scripts/install.ps1, scripts/install.sh, and `hermes desktop` in hermes_cli/main.py) re-ran `npm run pack` with ELECTRON_MIRROR set — but pack never downloads Electron anymore, so the mirror was never used and the retry re-read the same missing dist. The fallback was effectively dead. Drive the mirror through electron's own downloader instead: - Add a dist-presence check + a downloader helper (Test-ElectronDist / Restore-ElectronDist, _electron_dist_ok / _restore_electron_dist, _electron_dist_ok / _redownload_electron_dist) that wipes a partial dist + the path.txt version marker (electron's install.js short-circuits on it) and re-runs `node install.js`, optionally via a mirror. - On the first retry, repopulate a missing dist from the canonical source; on the mirror retry, re-fetch through npmmirror.com, then pack. - Gate the re-download on the dist check so an unrelated build failure (tsc/vite) doesn't trigger a pointless ~200 MB refetch, and skip the final pack when the binary still can't be fetched instead of failing the same way. 
* test(desktop): cover Electron dist re-download mirror fallback (#47266) Add behavior coverage for the electronDist re-download fix: - _electron_dist_ok across linux/win32/darwin, including the partial-dist case (dir present but binary missing) that makes the pinned electronDist fail. - _redownload_electron_dist: no-op when the binary is present, bail when install.js is absent, wipe a stale dist + path.txt marker and run electron's downloader with ELECTRON_MIRROR injected, and report failure when the download still produces no binary. - `hermes desktop`: the mirror fallback now drives electron's own downloader before re-running pack, and skips the final pack entirely when the binary can't be fetched. Replaces the old mirror test that asserted the (now-fixed) dead behavior of re-running `npm run pack` with ELECTRON_MIRROR set — pack never downloads Electron under the pinned electronDist, so that retry could never help.	2026-06-16 15:40:55 -05:00
liuhao1024	1b962f001e	fix(models): pass model.base_url to fetch_models in /model picker The /model interactive picker resolved a base_url from user credentials but never passed it to ProviderProfile.fetch_models(), causing the picker to always query the provider's hardcoded default endpoint instead of the user's custom URL (e.g. a company litellm proxy). - providers/base.py: add optional base_url parameter to fetch_models() - hermes_cli/models.py: pass resolved base_url to fetch_models() - Update all subclass overrides for signature compatibility - Add 6 regression tests covering override, fallback, and integration	2026-06-16 13:09:40 -07:00
chimpera	1039e90b5e	fix(model-switch): probe /v1/models for providers without api_key Section 3 of list_authenticated_providers (user-defined endpoints from the providers: config section) required an api_key before probing the endpoint's /v1/models for live model discovery. This broke local self-hosted backends (llama.cpp, Ollama, vLLM, etc.) that don't require authentication — they would only ever show the single default_model from config instead of the full model catalog. Section 4 (custom_providers list) already handled this correctly with the policy: probe when api_key is set OR when no explicit models are configured. Apply the same logic to Section 3 so local backends get full model discovery without requiring a placeholder api_key workaround. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-16 13:07:52 -07:00
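The probe policy this commit applies to Section 3 can be sketched as a single predicate. This is only an illustration of the stated rule (probe when an api_key is set OR when no explicit models are configured); the function name and argument shapes are hypothetical and not taken from the repository.

```python
def should_probe_models(api_key, configured_models):
    """Return True when the endpoint's /v1/models should be probed.

    Hypothetical helper: probe if an api_key is present, or if the user
    has not listed any explicit models for the provider.
    """
    return bool(api_key) or not configured_models


# A local backend with no key and no explicit model list still gets probed.
print(should_probe_models(None, []))
# An unauthenticated provider with an explicit model list is left alone.
print(should_probe_models("", ["my-local-model"]))
```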
cyb0rgk1tty	b7fa62c530	fix(inventory): keep user-defined custom providers in model dedup The #45954 model-dedup builds `user_models` from every is_user_defined row, then strips those model IDs from every row where is_aggregator(slug) is True. But is_aggregator() returns True for every `custom:*` slug, and list_authenticated_providers emits named custom providers with slug `custom:<name>` and is_user_defined=True. So a user's own custom provider is treated as an aggregator and filtered against user_models — which holds exactly its own models (the row helped build that set). Every model is removed, the row drops to zero, and the provider disappears from the model picker. Guard the dedup loop to skip is_user_defined rows: a user's configured provider is never an aggregator duplicate of itself. Built-in aggregators (openrouter, etc.) are still deduped as before. Adds a regression test.	2026-06-16 13:04:07 -07:00
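A minimal sketch of the guarded dedup loop described above, assuming each provider row is a dict with `slug`, `is_user_defined`, and `models` keys (the row shape is an assumption). `is_aggregator_stub` is a hypothetical stand-in for the real `is_aggregator()` and only mimics the behavior the commit message describes, where every `custom:*` slug counts as an aggregator.

```python
def is_aggregator_stub(slug):
    # Stand-in only: treats openrouter and every custom:* slug as an aggregator.
    return slug == "openrouter" or slug.startswith("custom:")


def dedup_aggregator_models(rows, is_aggregator=is_aggregator_stub):
    """Strip user-defined model IDs from aggregator rows, case-insensitively."""
    user_models = {
        model.lower()
        for row in rows
        if row.get("is_user_defined")
        for model in row["models"]
    }
    result = []
    for row in rows:
        # The guard: a user-defined row is never filtered against its own models.
        if row.get("is_user_defined") or not is_aggregator(row["slug"]):
            result.append(row)
            continue
        kept = [m for m in row["models"] if m.lower() not in user_models]
        result.append({**row, "models": kept})
    return result
```

Without the `is_user_defined` check in the loop, the `custom:<name>` row would be filtered against a set built from its own models and end up empty.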
Jaaneek	bbc842d31e	feat(xai): default to grok-build-0.1 Switch the default model for the xAI/Grok provider and the xAI web search backend from grok-4.3 to grok-build-0.1. grok-build-0.1 is already recognized by the model metadata, so no new model definition is required; grok-4.3 remains selectable.	2026-06-16 11:50:17 -07:00
kshitij	8fa562a399	Merge pull request #47391 from kshitijk4poor/feat/add-glm-5.2 feat: add z-ai/glm-5.2 to OpenRouter and Nous model lists	2026-06-17 00:02:05 +05:30
Wolfram Ravenwolf	f6a42b1acf	feat(prompt): make context-file truncation limit configurable PROBLEM: Automatic context files such as SOUL.md and AGENTS.md were capped by a hardcoded CONTEXT_FILE_MAX_CHARS value. Amy's local fork had raised that constant from 20K to 25K so a larger SOUL.md would not be silently truncated, but the hardcoded 25K value changed upstream default behavior and made the patch less generally useful. SOLUTION: Restore the upstream-compatible 20K default, add a context_file_max_chars config setting for users who intentionally keep larger identity/project-context files, keep chat-visible truncation warnings, and document the new setting. Tests cover the default, config override, explicit max_chars precedence, and the warning text.	2026-06-16 11:28:35 -07:00
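The precedence the tests cover (explicit max_chars, then the config override, then the 20K default) might look roughly like this. It is a sketch under assumptions: the config is modeled as a flat dict, the helper names are hypothetical, and the warning string is illustrative rather than the text the project emits.

```python
DEFAULT_CONTEXT_FILE_MAX_CHARS = 20_000  # upstream-compatible default from the commit


def resolve_max_chars(config, max_chars=None):
    """An explicit max_chars wins, then context_file_max_chars, then the default."""
    if max_chars is not None:
        return max_chars
    return config.get("context_file_max_chars", DEFAULT_CONTEXT_FILE_MAX_CHARS)


def truncate_context_file(text, limit):
    """Return (text, warning); warning is None when nothing was cut."""
    if len(text) <= limit:
        return text, None
    # Illustrative wording only; the real warning text is not shown in the log.
    warning = f"context file truncated to {limit} of {len(text)} characters"
    return text[:limit], warning
```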
kshitijk4poor	b2da39a0f3	feat: add z-ai/glm-5.2 to OpenRouter and Nous model lists Z.ai released GLM 5.2 on 2026-06-15, available on OpenRouter: - https://openrouter.ai/z-ai/glm-5.2 GLM-5.2 is Z.ai's flagship for long-horizon tasks, shipping a 1M-token context window (up from 200K on GLM 5.1) and tool calling. Per the OpenRouter API: text-only, context_length 1048576, tools supported. No separate -fast variant exists. The 1M context length, native zai picker entry, setup wizard, and Z.ai coding-plan auth entries for glm-5.2 already landed on main. This fills the remaining gap: the two aggregator surfaces where glm-5.1 appears but glm-5.2 did not. Changes: hermes_cli/models.py - Add z-ai/glm-5.2 to the OpenRouter fallback snapshot (OPENROUTER_MODELS) and the Nous Portal curated list (_PROVIDER_MODELS["nous"]), newest flagship first. Live catalogs surface it automatically when reachable; the fallback lists matter when the manifest fetch fails. website/static/api/model-catalog.json - Regenerated via scripts/build_model_catalog.py (not hand-edited) so the manifest stays in sync with the source lists; guarded by tests/hermes_cli/test_model_catalog.py.	2026-06-16 23:35:45 +05:30
kshitij	17251e865b	Merge pull request #46857 from liuhao1024/fix/model-picker-merge-live-static fix(models): merge live API results with curated static catalog in generic provider path	2026-06-16 23:30:34 +05:30
kshitijk4poor	658ac1d866	fix(models): keep curated-first ordering in live+curated merge; use pure-catalog helper in validation The generic live+curated merge (commit `630b438`) seeded the merged list from live results, demoting curated-only models below live ones. That regressed #46309, which deliberately surfaces the newest curated model (kimi-k2.7-code) FIRST in the native picker even when the live /models listing lags. Restore curated-first ordering: curated entries lead (in catalog order), live-only entries are appended for discovery. This keeps the #46850 fix (zai glm-5.2 now appears) without the kimi regression. Also switch the validate_requested_model curated fallback (commit `ee7b8a4`) from provider_model_ids() — which triggers a second, uncached live /models fetch with its own 8s timeout and may resolve different credentials than the api_key/base_url just probed — to the pure-catalog helper _model_in_provider_catalog(). Membership is checked against the shipped catalog only, with no extra network call. Tests: restore the curated-first assertion in test_kimi_coding_live_catalog_does_not_hide_curated_k2_7_code; update the new merge tests to curated-first semantics; de-circularize the validation fallback tests to patch _PROVIDER_MODELS (the real source) instead of mocking the function under test.	2026-06-16 23:25:07 +05:30
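The curated-first ordering restored here can be sketched as below. This is not the code in hermes_cli/models.py; the function name is hypothetical, and it assumes the case-insensitive dedup from the original merge commit is retained.

```python
def merge_curated_first(curated, live):
    """Curated entries lead in catalog order; live-only entries are appended."""
    merged = []
    seen = set()
    for model_id in list(curated) + list(live):
        key = model_id.lower()  # case-insensitive dedup
        if key in seen:
            continue
        seen.add(key)
        merged.append(model_id)
    return merged
```

Seeding the list from `curated` is what keeps the newest curated model first even when the live /models listing lags behind the catalog.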
brooklyn!	c6e99ab375	Merge pull request #46959 from NousResearch/bb/composer-model-selector feat(desktop): composer model selector, per-model presets & external-provider disconnect	2026-06-16 09:55:57 -05:00
Teknium	4d470b3dbb	fix(slack): route /debug via /hermes to restore Telegram-parity (#47248) Slack caps apps at 50 slash commands and the registry is at that ceiling, so adding /debug clamped it out of the native list and broke the telegram-parity test (debug on Telegram, absent from Slack native slashes, in neither exclusion set). Add 'debug' to _SLACK_VIA_HERMES_ONLY, the same treatment credits already gets. /debug stays native on CLI/TUI/Telegram/Discord and reachable via /hermes debug on Slack.	2026-06-16 06:20:01 -07:00
MrDiamondBallz	9a59ad73dd	fix(auth): preserve Codex pool-only rate-limit state Classify exhausted pool-only openai-codex credentials as quota/rate-limited instead of missing auth. This prevents auth status and runtime credential resolution from reporting missing credentials when a valid manual:device_code pool credential exists but is temporarily in a 429 usage-limit cooldown. Adds regression coverage for pool-only Codex auth status and runtime resolution.	2026-06-16 05:56:11 -07:00
teknium	6373aba80f	feat(gateway): rename to tool_progress_grouping, add config/docs/tests Follow-up to salvaged PR #41620: - Rename tool_progress_style -> tool_progress_grouping (clearer intent) - Add display.tool_progress_grouping to DEFAULT_CONFIG (accumulate default) - Document in messaging docs incl. 'separate is noisier, only where progress enabled' - Add resolver tests (default/global/override/invalid/case)	2026-06-16 05:49:24 -07:00
teknium	98ae28657f	feat(display): document and test memory_notifications setting Follow-up to salvaged PR #4684: - Add display.memory_notifications to DEFAULT_CONFIG (off|on|verbose, default on) - Document the setting in docs/user-guide/features/memory.md - Add resolver tests for off/on/verbose memory + skill paths	2026-06-16 05:45:40 -07:00
Teknium	a6364bfa08	fix(telegram): edit streamed previews in place as rich (Bot API 10.1) (#46890 ) Streamed Telegram replies that finalize through editMessageText were converted to MarkdownV2, which has no table syntax and rewrites pipe tables into bullet lists — users saw a table while streaming that collapsed to a list at the last moment. Finalize now edits the existing preview IN PLACE via Bot API 10.1's editMessageText rich_message parameter when the content has constructs the legacy path degrades (tables, task lists, <details>, block math). No fresh send + delete, so no duplicate-preview flicker — the reason #46206 reverted the fresh-final re-send path. prefers_fresh_final_streaming stays False; the in-place edit replaces it. - _needs_rich_rendering(): rich reserved for table/task-list/details/math (adapted from #45995, @YonganZhang); plain replies stay on MarkdownV2. - _try_edit_rich(): editMessageText + rich_message via do_api_request, mirroring _try_send_rich's fallback/latch/transient contract. - edit_message finalize tries rich in place before the 4,096 overflow pre-flight (rich cap is 32,768), falling back to legacy on rejection. - rich_messages default flipped back to True (DEFAULT_CONFIG + adapter). - docs (en + zh-Hans) + cli-config example updated to default-on. Closes the root cause behind #45911 / #46009.	2026-06-16 05:26:04 -07:00
liuhao1024	ee7b8a4672	fix(models): validate_requested_model falls back to curated catalog when live API omits model When live /v1/models responds but omits a model that exists in the curated static catalog, validate_requested_model now accepts it with a note instead of rejecting. This covers the /model slash-command path (the picker path was already fixed in the parent commit). Addresses review feedback from potatogim on #46857.	2026-06-16 16:24:11 +08:00
liuhao1024	630b43892d	fix(models): merge live API results with curated static catalog in generic provider path When a provider's live /v1/models endpoint returns a stale or incomplete list (e.g. Z.AI missing glm-5.2), the generic profile-based code path returned only the live results, silently dropping curated models. Generalize the kimi-coding merge pattern to all providers: live entries come first (provider's preferred order), then curated-only entries are appended with case-insensitive dedup. This ensures models that the live endpoint omits still appear in /model picker. Fixes #46850	2026-06-16 16:21:01 +08:00
Brooklyn Nicholson	a0ec4f52b9	feat(desktop): disconnect external (CLI-managed) providers External providers (Claude Code) store creds outside Hermes, so the disconnect API refuses them. The backend now hands the GUI a per-OS `disconnect_command` that clears the credential the same way the CLI's logout does (macOS Keychain entry + ~/.claude/.credentials.json), and the misleading "use claude setup-token" hint is corrected. Settings → Providers offers a Disconnect button for these: it confirms, leaves Settings, and runs the removal command in the embedded terminal via a new runInTerminal() (queues onto $terminalInjection; the terminal pane flushes and clears it once its session is live). The expanded list also gets its own "Other providers" header so it no longer reads as grouped under "Connected". API-managed providers keep the one-click (trash) disconnect.	2026-06-16 00:08:21 -05:00
brooklyn!	c6b0eb4de0	fix(desktop): open remote-gateway artifacts via authenticated download (#46895) On a remote gateway connection, agent-written files live on the gateway host, not the desktop's disk, so the Artifacts view's file:// hrefs failed ("Invalid external URL") and image thumbnails broke. 
Make mediaExternalUrl() remote-aware in one place: in remote mode it rewrites gateway-local paths to GET /api/files/download (a new endpoint that streams the file as a Content-Disposition: attachment). The artifacts view now resolves through it, and so do the existing chat-media and generated-image callers, for free. The download endpoint stays auth-gated; auth_middleware additionally accepts the session token as a ?token= query param for this one path so a shell/browser-opened download (which can't set the session header) still authenticates — the same query-token tradeoff as the /api/pty WebSocket. It is NOT added to PUBLIC_API_PATHS. Salvages #46663 (which carried ~19k lines of CRLF noise and made the endpoint public). Reimplemented on a clean LF base with the security hole closed and tests added. Co-authored-by: qingshan89 <qs2816661685@gmail.com>	2026-06-15 23:50:19 -05:00
Gille	0441b7f19f	fix(desktop): route global remote profile REST calls (#47011) * fix(desktop): route global remote profile REST calls * fix(dashboard): scope oauth provider routes by profile * test(tui): isolate notification poller queue	2026-06-15 23:24:55 -05:00
Shannon Sands	7cd71de1f4	Simplify dashboard update detection to containers	2026-06-15 20:08:39 -07:00
Shannon Sands	b1d6a57883	Detect containerized dashboard update management	2026-06-15 20:08:39 -07:00
Shannon Sands	0b6b29a30c	Hide hosted dashboard update controls	2026-06-15 20:08:39 -07:00
Teknium	c66ecf0bc3	feat(delegation): async background subagents via delegate_task(background=true) (#40946 ) * feat(delegation): async background subagents via delegate_task(background=true) delegate_task(background=true) dispatches a subagent that runs in the background and returns a handle immediately, so the user and model keep working while it runs. The full result — plus the original task source — re-enters the conversation as a new turn when the subagent finishes, riding the same completion-queue rail as terminal background processes. - tools/async_delegation.py: daemon-executor registry, capacity cap, rich self-contained completion event pushed onto the shared process_registry.completion_queue (type='async_delegation'). - delegate_tool.py: background param + single-task dispatch branch; batch async rejected (v1). - process_registry.py: format_process_notification renders the rich task-source block (goal/context/toolsets/model/status/result). - gateway/run.py: dedicated _async_delegation_watcher drains + injects results into the originating session (idle + post-turn), session_key routing enrichment, shutdown interrupt of dangling delegations. - config: delegation.max_async_children (default 3). Reuses the existing idle-drain wiring rather than mutating a running agent loop, preserving message-role alternation and prompt-cache invariants. 13 targeted tests; CLI + gateway paths E2E-verified. * test(delegation): make async non-blocking tests environment-independent CI 'test (5)' flaked on a cold, 8-worker runner: the first delegate_task(background=true) call measured 2.27s of one-time setup (config load + child-agent construction + imports), tripping the elapsed < 1.0 wall-clock assertion. That assertion was testing setup overhead, not blocking. Replace the wall-clock thresholds with the real invariant: dispatch returns while the child is still gated (active_count == 1, completion queue empty), which a synchronous impl could not do. 
Keep only a loose 4s sanity backstop well under the runner's 5s gate. * fix(delegation): harden async background delegation Follow-up review fixes: - Detach background child from parent._active_children at dispatch — otherwise parent-turn interrupts (Ctrl+C, mid-turn steering), cache evicts (release_clients), and session close (/new) kill/close the detached subagent mid-run, defeating the point of background mode. Lifecycle is owned by the async registry's interrupt_fn. - Make the capacity check atomic with the record insert (TOCTOU: two concurrent dispatches could both pass active_count() and exceed the cap). - TUI dedup: key async_delegation events by delegation_id — the fallthrough keyed them all as ("", type), suppressing every completion after the first in the desktop/TUI status feed. - CLI /stop now interrupts running background delegations and /agents lists them (they live outside the process registry and were invisible). - Drop stray unbalanced ']' line from the re-injection block and the unused _ASYNC_DEFAULT import. Tests: detach-at-dispatch + concurrent-capacity race added (15 total in test_async_delegation.py); 137 delegate + 140 process-registry/notify/watch + 7 TUI dedup tests pass. * fix(delegation): harden async background completion drains	2026-06-15 13:33:12 -07:00
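The TOCTOU fix above (capacity check made atomic with the record insert) can be illustrated with a lock-guarded registry. The class and helper names are hypothetical and this is not the code in tools/async_delegation.py; only the default cap of 3 comes from the commit message (delegation.max_async_children).

```python
import threading


class AsyncDelegationRegistry:
    """Hypothetical registry: check-and-insert happens under one lock."""

    def __init__(self, max_children=3):
        self._lock = threading.Lock()
        self._records = {}
        self._max_children = max_children

    def try_register(self, delegation_id, record):
        # Checking capacity and inserting in the same critical section means
        # two concurrent dispatches cannot both pass the check.
        with self._lock:
            if len(self._records) >= self._max_children:
                return False
            self._records[delegation_id] = record
            return True

    def release(self, delegation_id):
        with self._lock:
            self._records.pop(delegation_id, None)

    def active_count(self):
        with self._lock:
            return len(self._records)


def dispatch_concurrently(registry, attempts):
    """Race `attempts` registrations and return how many were accepted."""
    results = []

    def worker(i):
        results.append(registry.try_register(f"d{i}", {"goal": f"task {i}"}))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(attempts)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)
```

A version that called `active_count()` first and inserted afterwards would release the lock between the two steps, which is the window the commit closes.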
xxxigm	b2a4766463	fix(dump): report effective terminal backend in `hermes debug` `terminal.backend` in config.yaml is bridged to the TERMINAL_ENV env var, but a TERMINAL_ENV set in .env / the shell overrides config and is what terminal_tool actually uses. The dump printed only the config value, so a user whose agent was jailed in a docker/podman sandbox via a stale TERMINAL_ENV still saw `terminal: local` — hiding the real cause. Report the effective backend and flag when TERMINAL_ENV overrides config.yaml.	2026-06-15 12:31:23 -07:00
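A rough sketch of the "effective backend" resolution this commit reports, assuming the override is detected by comparing TERMINAL_ENV with the config value. The function name and return shape are hypothetical; only `terminal.backend` and `TERMINAL_ENV` come from the commit message.

```python
import os


def effective_terminal_backend(config_backend, env=None):
    """Return (backend, overridden) for the dump output.

    Hypothetical helper: a TERMINAL_ENV value that differs from the
    config.yaml backend wins and is flagged as an override.
    """
    env = os.environ if env is None else env
    override = env.get("TERMINAL_ENV")
    if override and override != config_backend:
        return override, True
    return config_backend, False


# A stale TERMINAL_ENV=docker hides behind `terminal: local` unless flagged.
print(effective_terminal_backend("local", {"TERMINAL_ENV": "docker"}))
```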
liuhao1024	60cc42e38b	fix(inventory): deduplicate models between user-defined and aggregator providers When a user-defined provider (e.g. litellm-proxy) and an aggregator (e.g. openrouter) both advertise the same model name, the Desktop/TUI model picker would show the model under both groups. Selecting it from the aggregator row silently set model.provider to the aggregator, breaking calls because the aggregator doesn't actually serve that model ID. Fix: after list_authenticated_providers() returns, collect all models from user-defined provider rows and filter them out of aggregator rows. Uses is_aggregator() from hermes_cli/providers.py to identify aggregators. Case-insensitive matching. Fixes #45954	2026-06-15 12:25:41 -07:00
liuhao1024	9df1a1a8de	fix(doctor): recognize nvidia as vendor-slug-accepting provider NVIDIA NIM API uses vendor-prefixed model IDs (e.g. qwen/qwen3.5-122b-a10b, nvidia/nemotron-3-super-120b-a12b). The doctor command incorrectly warns that vendor-prefixed slugs belong to aggregators like openrouter when nvidia is the configured provider. Add 'nvidia' to the providers_accepting_vendor_slugs set so doctor no longer raises false-positive warnings for valid NVIDIA NIM configurations. Fixes #35425	2026-06-15 12:24:46 -07:00

2814 commits