hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-20 15:33:54 +00:00

Author	SHA1	Message	Date
Teknium	db489a315f	fix(tests): allowlist tmp_path for kanban_notify artifact delivery (#30852) `_deliver_kanban_artifacts` routes candidates through `BasePlatformAdapter.filter_local_delivery_paths` (added in `41d2c758c`), which rejects paths outside `MEDIA_DELIVERY_SAFE_ROOTS`. The two artifact-delivery tests create fixtures under `tmp_path`, which lives outside the cache roots — so under CI's hermetic HOME the filter silently dropped both fake files and the assertions on `images_uploaded` / `documents_uploaded` failed. Fix: monkeypatch `HERMES_MEDIA_ALLOW_DIRS=str(tmp_path)` in both tests so the safety filter accepts the fixtures. Production behaviour unchanged; test-side fix only. CI fail repro on origin/main: test (6) shard, both test_notifier_uploads_artifacts_on_completion and test_notifier_artifact_delivery_skips_missing_files.	2026-05-23 02:34:34 -07:00
xxxigm	5b6f0b695b	test(tls-fd-recycle): pin shutdown-only + thread-aware close contract (#29507) Ten regressions across both prongs of the #29507 fix, organised so each test names exactly which way the bug could come back: Prong 1 — ``force_close_tcp_sockets``: * ``shutdown_only_no_close`` is the smoking-gun assertion. If a future refactor adds back ``sock.close()`` to this helper, the FD-recycling race that wrote TLS bytes on top of ``kanban.db`` is back, and this trips. * ``uses_shut_rdwr`` pins that both halves are shut down (a half-close wouldn't unblock a worker stuck in ``recv``). * ``swallows_oserror_on_shutdown`` covers the already-shutdown case. * ``handles_multiple_pool_entries`` walks all pool connections. Prong 2 — thread-aware ``_close_request_client_once``: * ``stranger_thread_aborts_only_no_close`` simulates the asyncio_0 → Thread-1616 interrupt path: stranger drives abort, holder stays populated for the worker's eventual finally. * ``owner_thread_pops_and_full_close`` is the worker-thread path: pops + full close. * ``stranger_then_owner_close_sequence_runs_full_close_exactly_once`` replays the reporter's exact timeline at object level: abort runs once, full close runs once, holder ends empty. Agent surface: * ``_abort_request_openai_client_does_not_call_client_close`` pins that the new entrypoint shuts sockets and emits the ``deferred_close=stranger_thread`` marker but never calls ``client.close()``. * ``_abort_request_openai_client_null_client_is_noop`` defensive. End-to-end: * ``fd_recycle_window_closed_by_shutdown_only`` reproduces the race at object level — runs the abort path from a stranger thread and asserts that no ``close()`` ever fires, so the kernel can never recycle the FD under the owner's still-active reference.	2026-05-23 02:31:10 -07:00
xxxigm	30c22f1158	fix(api-call): defer client.close() to owning worker thread on interrupt (#29507) Layer-2 defense for the FD-recycling race: even with ``force_close_tcp_sockets`` reduced to shutdown-only, the followup ``client.close()`` in ``_close_openai_client`` still walks the httpx pool and closes sockets — and if called from a stranger thread (the interrupt-check loop, the stale-call detector) it has the same FD-recycling exposure that wrote a TLS record on top of ``kanban.db``. Stamp the request_client_holder with the owning thread's ident at ``_set_request_client`` time. In ``_close_request_client_once``: * Owning thread (the worker's ``finally``) → pop + ``client.close()`` via ``_close_request_openai_client``, exactly as before. * Stranger thread → ``_abort_request_openai_client`` (new): only ``shutdown(SHUT_RDWR)`` the pool sockets and log a deferred-close marker. The holder stays populated so the worker's eventual ``finally`` performs the real close from its own thread context, where the FD release races nothing. Applied symmetrically to both the non-streaming ``interruptible_api_call`` and the streaming variant — both routinely get hit by stranger-thread interrupts. The log field ``tcp_force_closed=N`` keeps its existing shape; the new abort path adds ``deferred_close=stranger_thread`` so production triage can distinguish the two close kinds.	2026-05-23 02:31:10 -07:00
xxxigm	e2a7d73a66	fix(force_close_tcp_sockets): shutdown only, do not release FD (#29507) The helper used to call ``socket.shutdown(SHUT_RDWR)`` followed by ``socket.close()`` to drop CLOSE-WAIT entries immediately. On its own ``shutdown()`` is safe from any thread — it only sends FIN and breaks pending ``recv``/``send`` — but ``close()`` releases the FD integer to the kernel. When the helper runs on a stranger thread (the interrupt loop, the stale-call detector) the FD release races the owning httpx worker thread that still has the same integer cached inside the SSL BIO. The kernel then recycles that integer to the next ``open()`` call — in production, kanban dispatcher's ``kanban.db`` — and the worker's delayed TLS flush writes a 24-byte TLS application-data record on top of the SQLite header. Restrict the helper to ``shutdown(SHUT_RDWR)`` only. The owning httpx worker's own unwind will close the underlying socket via the same Python ``socket.socket`` object, which atomically swaps ``_fd`` to -1 before issuing ``close(2)`` — no FD-aliasing window. The log field ``tcp_force_closed=N`` is kept (now counts shutdowns) so existing dashboards / log parsers keep working.	2026-05-23 02:31:10 -07:00
sprmn24	53cb6d32be	fix(agent): use atomic_json_write for request debug dumps instead of bare write_text	2026-05-23 02:30:57 -07:00
sprmn24	b183be95a2	fix(gateway-windows): atomic write for .cmd and startup launcher scripts	2026-05-23 02:30:41 -07:00
walli	60b0a0e006	fix(qqbot): fix SILK magic byte detection slice length _guess_ext_from_data: data[:5] == b"#!SILK" -> data[:6] (6-byte string) _looks_like_silk: data[:4] == b"#!SILK" -> data[:6] The previous slices were too short to ever match the 6-byte "#!SILK" literal, relying entirely on the "#!SILK_V3" (9-byte) and 0x02! (2-byte) fallback paths for SILK format detection.	2026-05-23 02:27:17 -07:00
walli	0e7448d63a	fix(qqbot): use original attachment filename for cached files Add original_name parameter to _download_and_cache, preferring the attachment metadata filename over the CDN URL path basename. Previously files were cached with meaningless QQ CDN hash names (e.g. qqdownload_...oadftnv5), causing ugly filenames when sent back to users. Aligns with qqbot-agent-sdk's AttachmentDownloader.download_document.	2026-05-23 02:27:17 -07:00
walli	a54f5afc70	fix(qqbot): handle op 7/9 and expand fatal close code set 1. Handle op 7 (Server Reconnect): close WS to trigger reconnect loop while preserving session for Resume 2. Handle op 9 (Invalid Session): check d value to determine if session is resumable; clear session only when not resumable 3. Remove 4009 from session-clearing set (connection timeout is resumable) 4. Expand fatal close codes: 4001/4002/4010-4014 now stop reconnect immediately instead of retrying uselessly 5. Add unit tests	2026-05-23 02:27:17 -07:00
walli	bbd77d165c	fix(qqbot): add INTERACTION intent and expose video/file cached paths 1. Add INTERACTION intent bit (1<<26) to _send_identify, fixing approval button clicks not being received (INTERACTION_CREATE events were never dispatched by the gateway) 2. Include local cached path in video/file attachment descriptions so the LLM can reference files for re-sending to users 3. Add unit tests (TestIdentifyIntents, TestProcessAttachmentsPathExposure)	2026-05-23 02:27:17 -07:00
teknium1	66d81f9e14	fix(gateway): don't swallow expansion errors in runtime config helper A bare except in _load_gateway_runtime_config would silently return the unexpanded dict on any _expand_env_vars failure — masking the very bug this helper exists to fix. Drop it; let the caller see real errors.	2026-05-23 02:27:08 -07:00
QuenVix	2362cc4688	fix(gateway): enforce env variable template expansion on runtime config loaders	2026-05-23 02:27:08 -07:00
QuenVix	d21ac579e9	fix(gateway): honor key_env in auth-failure fallback resolution	2026-05-23 02:25:53 -07:00
Teknium	99671a8634	test(kanban): allow tmp_path artifacts past media-delivery validator PR #`41d2c758c` ("Fix unsafe gateway media path delivery") tightened `validate_media_delivery_path` so that artifacts emitted by the agent must live inside `MEDIA_DELIVERY_SAFE_ROOTS` (Hermes-managed cache dirs) or an operator-allowlisted root via `HERMES_MEDIA_ALLOW_DIRS`. Two kanban-notifier tests put their PDFs and PNGs under pytest's `tmp_path`, which is correctly rejected by the new validator. They started failing on main as soon as that PR landed: FAILED tests/hermes_cli/test_kanban_notify.py::test_notifier_uploads_artifacts_on_completion FAILED tests/hermes_cli/test_kanban_notify.py::test_notifier_artifact_delivery_skips_missing_files Symptom in logs: "Skipping unsafe local file path outside allowed roots". The validator is doing exactly what it should — the tests were relying on the looser pre-fix behaviour. Fix: add `HERMES_MEDIA_ALLOW_DIRS=tmp_path` to the `kanban_home` fixture so artifacts under `tmp_path` are recognised as safe. This is the same allowlist mechanism the operator-facing env var documents.	2026-05-23 02:25:09 -07:00
Teknium	5772e638c9	chore: drop in-repo infographic/ directory; keep PR-body URLs only (#30854) PR infographics belong in PR descriptions, not committed to the repo. Removes the 13 archived directories under infographic/ and adds the path to .gitignore so future generations don't accidentally land in-tree. The fal.media URLs embedded in each PR's body remain the canonical artifact — those PR descriptions are the storage.	2026-05-23 02:25:03 -07:00
sprmn24	b2e6fdd3bf	fix(agent): log warning when fallback model normalization fails instead of silently swallowing	2026-05-23 02:23:24 -07:00
teknium1	70aaa774be	fix(opencode-go): emit Kimi reasoning_effort, match KimiProfile shape The Kimi K2 branch added in the prior commit only emitted extra_body.thinking and dropped reasoning_effort entirely. KimiProfile (api.moonshot.ai/v1) sends both fields, and OpenCode Go proxies to the same Moonshot backend. Mirror that shape on the Go path so /reasoning effort actually reaches Kimi. - low/medium/high pass through verbatim - xhigh/max clamp to high (Moonshot's max supported value) - minimal / unknown effort → omit reasoning_effort, keep thinking on - disabled / no config → unchanged - DeepSeek branch unchanged	2026-05-23 02:20:28 -07:00
Harish Kukreja	3589960e03	fix(provider): expose OpenCode Go reasoning controls	2026-05-23 02:20:28 -07:00
helix4u	71291d83cd	test: keep tirith checks hermetic	2026-05-23 02:20:14 -07:00
QuenVix	52a368fa72	fix(gateway): preserve WhatsApp pairing approvals across JID/LID alias flips	2026-05-23 01:46:34 -07:00
Teknium	3127a41cb1	test(acp): pin parse_model_input in slash-command tests The two ACP slash-command tests that exercise `provider:model` routing (`test_set_session_model_accepts_provider_prefixed_choice` and `test_model_switch_uses_requested_provider`) relied on the live `hermes_cli.models._KNOWN_PROVIDER_NAMES` / `_PROVIDER_ALIASES` module state to parse `anthropic:claude-sonnet-4-6` into `("anthropic", "claude-sonnet-4-6")`. If any earlier test in the same xdist worker registers a custom provider that shadows `anthropic` or otherwise mutates those globals, the parser falls into the `detect_provider_for_model` branch and resolves to `custom` instead. Observed once in CI on run 26326728502 / job 77505732299 as `AssertionError: assert 'custom' == 'anthropic'` — could not reproduce locally under per-file isolation, so the failing in-file order was specific to a particular xdist scheduling. Monkeypatching `parse_model_input` + `detect_provider_for_model` for both tests removes the global-catalog dependency, so the tests now only exercise what they were written to verify (the `requested_provider -> runtime -> AIAgent kwargs` plumbing).	2026-05-23 01:44:56 -07:00
xxxigm	6a2df9f451	docs(env): clarify HERMES_ENABLE_PROJECT_PLUGINS contract (#29156) The reference entry now documents the truthy set (``1`` / ``true`` / ``yes`` / ``on``) explicitly, matches the falsy half (``0`` / ``false`` / ``no`` / ``off`` / empty string) that the GHSA-5qr3-c538-wm9j fix re-aligned both the agent loader and the dashboard web server around, and points readers at the defence-in-depth rule that project plugins never have their Python ``api`` file auto-imported by the dashboard regardless of the env var.	2026-05-23 01:43:52 -07:00
xxxigm	8bf99227f0	fix(plugins): block plugin-api path traversal + project RCE (#29156) GHSA-5qr3-c538-wm9j — half two of the bypass chain. ``_mount_plugin_api_routes`` imports each dashboard plugin's manifest ``api`` field as a Python module via ``importlib.util.spec_from_file_location`` — arbitrary code execution by design. Two primitives in the surrounding code turned that "by design" RCE into a usable attack: 1. Absolute paths in the manifest swallow the plugin directory. ``Path('safe/dashboard') / '/tmp/evil.py'`` resolves to ``/tmp/evil.py``, so a single manifest line ``{"api": "/tmp/payload.py"}`` was enough to redirect the importer at any Python file on disk. 2. ``..`` traversal in the manifest climbs out of the dashboard directory. ``Path('plugins/safe/dashboard') / '../../../tmp/evil.py'`` lands in ``/tmp/evil.py`` after ``resolve()`` — the static-asset handler (``serve_plugin_asset``) already defends against this via ``is_relative_to``; the api-mount path didn't. Fix at three layers so a regression in any one can't re-open the advisory: * New ``_safe_plugin_api_relpath`` validator runs at discovery time and stores only sanitised relative paths on the plugin entry's ``_api_file`` field. Absolute paths, ``..`` traversal, empty / non-string values, and paths that ``resolve()`` outside the plugin's ``dashboard/`` directory are rejected with a warning naming the plugin. ``has_api`` follows the sanitised value so the dashboard frontend doesn't render a fake "Backend API" badge for plugins whose api was scrubbed. * ``_mount_plugin_api_routes`` re-validates the resolved path against the live filesystem just before the import — defence in depth in case ``_dir`` is tampered with post-cache or a future caller bypasses the discovery-time validator. * Project plugins (``source == "project"``) are refused outright for backend import. ``./.hermes/plugins/`` ships with the CWD, so any threat model that includes "user opens a malicious repo" treats it as attacker-controlled; project plugins can still extend the UI via static JS/CSS but their Python ``api`` is no longer auto-imported. Combined with the truthy env-gate fix from the previous commit, the original advisory chain now fails at two distinct choke points.	2026-05-23 01:43:52 -07:00
xxxigm	da636e982b	test(plugins): regression coverage for project-plugin RCE chain (#29156) 35 new tests across 5 classes covering every layer of the GHSA-5qr3-c538-wm9j defence. Each class corresponds to one chokepoint so a regression in any single layer is caught by the named class: * ``TestProjectPluginsEnvGate`` (13 cases) — parametrised over both the documented truthy values (``1`` / ``true`` / ``yes`` / ``on`` + uppercase variants) and the previously-bypassing falsy strings (``0`` / ``false`` / ``no`` / ``off`` / ``""`` / ``False``). The falsy half is the direct env-bypass repro: pre-fix any non-empty string enabled the project source. * ``TestApiPathSanitizer`` (16 cases) — unit-level coverage of the new ``_safe_plugin_api_relpath`` helper. Absolute paths (``/etc/passwd``, ``/tmp/payload.py``, ``/usr/bin/python``), ``..``-traversal payloads (including nested ``subdir/../../..``), and non-string / empty / whitespace-only values must all return ``None``. Safe relative paths (``api.py``, ``backend/routes.py``) round-trip unchanged so legitimate plugins keep working. * ``TestDiscoveryScrubsApiField`` (3 cases) — end-to-end through ``_discover_dashboard_plugins`` with a real manifest on disk. Verifies that the cached plugin entry's ``_api_file`` is scrubbed at discovery time (``None`` + ``has_api: False``) so any downstream consumer can't be tricked into re-deriving the unsafe path from cache. * ``TestMountApiRoutesRefusesUntrusted`` (3 cases) — pokes synthetic plugin entries with each refusal vector directly into the cache and patches ``importlib.util.spec_from_file_location`` to assert it is not invoked for project-source / traversal payloads, and is invoked normally for bundled / user plugins. * ``TestEndToEndPocBlocked`` (1 case) — reproduces the original advisory PoC: operator sets ``HERMES_ENABLE_PROJECT_PLUGINS=0`` believing project plugins are off, attacker plants a manifest in CWD's ``.hermes/plugins/`` with ``api`` pointing at an absolute payload path. Asserts that the importer is never called against the payload path and that ``hermes_dashboard_plugin_evil`` is not in ``sys.modules`` after the mount routine runs. An autouse fixture busts ``_dashboard_plugins_cache`` before and after each test so the production cache (populated by the import-time ``_mount_plugin_api_routes()`` call) can't bleed in. All 12 pre-existing dashboard-plugin tests in ``test_web_server.py`` still pass unchanged.	2026-05-23 01:43:52 -07:00
xxxigm	09f85f2cf7	fix(plugins): apply truthy env semantics to project-plugin gate (#29156) GHSA-5qr3-c538-wm9j — half one of the bypass chain. ``_discover_dashboard_plugins`` opted into the untrusted ``./.hermes/plugins/`` source via ``if os.environ.get("HERMES_ENABLE_PROJECT_PLUGINS"):`` — which is True for any non-empty string. ``=0``, ``=false``, ``=no``, ``=off`` all return non-empty strings and so enabled the project source even though every operator (and the agent loader, ``hermes_cli/plugins.py`` line 815) reads those values as "disabled". An attacker who can land a manifest under the CWD's ``.hermes/plugins/`` directory — a malicious cloned repo, a worktree checked out from a forked PR, a CI runner workspace — was therefore guaranteed to get their manifest discovered the moment the user ran ``hermes dashboard`` from that directory, regardless of whether the user thought they had project plugins disabled. Switch to the shared ``utils.env_var_enabled`` helper used by the agent loader so the gate accepts the documented truthy set (``1`` / ``true`` / ``yes`` / ``on``, case-insensitive) and treats everything else — including ``0`` / ``false`` / ``no`` — as off. Half two (path-traversal + project-source ``api`` import) lands in the next commit. Together they break the RCE chain at two distinct choke points so a future regression in either one alone can't re-open the advisory.	2026-05-23 01:43:52 -07:00
Teknium	11e6dd3c60	chore(release): add AUTHOR_MAP entry for egilewski (PR #30432) (#30833)	2026-05-23 01:41:31 -07:00
Eugeniusz Gilewski	41d2c758c3	Fix unsafe gateway media path delivery	2026-05-23 01:40:35 -07:00
Markus	4a91e36495	fix(gateway): separate observed Telegram group context	2026-05-23 01:33:42 -07:00
Teknium	729a778af0	infographic: PR #17659 read-deny credentials salvage	2026-05-22 20:15:09 -07:00
Teknium	97e975edd2	fix(file-safety): widen read-deny to .env, mcp-tokens/, webhook secrets, root Extends @briandevans's PR #17659 from {auth.json, auth.lock, .anthropic_oauth.json} to also cover: - HERMES_HOME/.env (provider API keys) - HERMES_HOME/webhook_subscriptions.json (per-route HMAC secrets) - HERMES_HOME/mcp-tokens/ (OAuth token directory; dir + everything inside) …AND iterates over both _hermes_home_path() AND _hermes_root_path() so profile-mode runs (HERMES_HOME = <root>/profiles/<name>) also block <root>/{auth.json, .env, mcp-tokens/, ...}. Same widening shape as the write-deny side already does (#15981, #14157). Explicitly NOT a security boundary. Per the personal-assistant trust model, the terminal tool runs as the same OS user and can `cat auth.json` directly. This read-deny exists as defense-in-depth: - Models that respect tool denials empirically tend to stop rather than reach for the shell. - The denial surfaces an audit trail when something tries to read credentials — easier to spot in logs than a generic `cat`. Docstring + error message both flag this as defense-in-depth so future contributors don't mistake it for a real security boundary and don't re-decline reports that propose the same fix shape. Absorbs the .env and mcp-tokens/ coverage from @tomqiaozc's parallel PR #8055 (closed-as-duplicate, credited). Co-authored-by: Tom Qiao <zqiao@microsoft.com>	2026-05-22 20:15:09 -07:00
briandevans	567ea61298	fix(file-safety): block auth.json read via TERMINAL_CWD relative path read_file_tool resolves relative paths against TERMINAL_CWD (or the task's live terminal cwd), but the prior call passed the original unresolved string to get_read_block_error. That function's own resolve() is anchored at the Python process cwd, so when a task's TERMINAL_CWD pointed at HERMES_HOME and the agent issued read_file on the relative path "auth.json", the credential-store denylist was never reached and the file was read normally. Pass the already-resolved absolute path string at the file_tools call site, document the contract on get_read_block_error, and add a read_file_tool-level regression test that pins the relative-path case under TERMINAL_CWD == HERMES_HOME. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 20:15:09 -07:00
briandevans	056e00a77e	fix(file-safety): block read_file on HERMES_HOME credential stores (#17656) `get_read_block_error` previously only denied reads inside `${HERMES_HOME}/skills/.hub`, which left `auth.json` (provider OAuth state + plaintext API keys) and `.anthropic_oauth.json` (Anthropic PKCE tokens) directly readable by the agent. A prompt-injection reaching `read_file` could exfiltrate active provider credentials in plaintext. Mode-0600 file permissions only protect against other Unix users — the agent runs as the file's owner, so `read_file` is unaffected. Extend the existing deny list with the three credential paths identified in #17656 (`auth.json`, `auth.lock`, `.anthropic_oauth.json`). The check uses the same `Path.resolve()` pattern as `skills/.hub`, so symlink/path-traversal indirection is caught too. The agent doesn't need to read these directly — `auxiliary_client` and `credential_pool` consume them through process env / OAuth flows that bypass `read_file`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 20:15:09 -07:00
Teknium	7f7245bf62	infographic: PR #6656 skill hub safety audit salvage	2026-05-22 19:59:24 -07:00
Teknium	3f78d8073c	fix(skills): make content_hash filename-sensitive too (symmetric with bundle_content_hash) PR #6656 added rel_path + \x00 prefixing to ``bundle_content_hash`` so a filename swap between two files in a bundle changes the digest. But it only patched the in-memory side — ``content_hash`` in ``tools/skills_guard.py`` (the on-disk equivalent) still hashed file contents only. These two functions need to stay symmetric: ``check_for_skill_updates`` compares the disk hash of an installed skill against the bundle hash of the upstream copy. With the asymmetric fix, every clean install showed as drifted because the digests no longer matched (2 existing tests in ``test_skills_hub.py`` started failing as soon as the contributor's change landed). Apply the same ``rel_path + \x00 + content`` shape to the disk-side function. Both functions now produce the same digest for the same skill content laid out two ways. Documented the symmetry invariant in the docstring so a future change to either function knows to touch both. Also adds tests/tools/test_pr_6656_regressions.py with 10 regression tests covering all three fixes salvaged in PR #6656: - uninstall_skill path traversal (4 cases: parent segments, absolute paths, symlink escape, legitimate skill) - bundle_content_hash filename swap detection (4 cases: in-memory swap, identity, disk-side swap, bundle↔disk symmetry) - list_pending lock contract (2 cases: source-grep contract, smoke) Also fixes AUTHOR_MAP entry for @aaronlab — their commit email (1115117931@qq.com) maps to "aaronagent" which isn't a real GitHub login, so changelog @mentions would 404.	2026-05-22 19:59:24 -07:00
aaronagent	b82608a6f5	fix(skills,pairing): path traversal guard in uninstall, lock list_pending, hash file paths - skills_hub: validate that uninstall_skill's install_path resolves inside SKILLS_DIR before calling shutil.rmtree, preventing recursive deletion of arbitrary directories via poisoned lock.json entries - skills_hub: include file paths (not just contents) in bundle_content_hash so swapping filenames between files changes the hash, strengthening update-detection integrity - pairing: wrap list_pending() in self._lock so _cleanup_expired() file writes don't race with concurrent generate_code()/approve_code() calls Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-05-22 19:59:24 -07:00
teknium1	8cf977c8b1	fix(plugins): widen _sanitize_plugin_name for category-namespaced names Follow-up to PR #28832 — the dashboard plugin routes now accept slashed names like `observability/langfuse` and `image_gen/openai`, but `_sanitize_plugin_name` still rejected forward slash and so dashboard update + remove on those plugins fell through to '404 not found' even though they exist on disk. Adds an opt-in `allow_subdir=True` flag that: - Permits internal forward slashes (category-namespaced plugin keys emitted by `_discover_all_plugins`). - Strips leading and trailing slashes. - Still rejects `..` and backslash, and still asserts the resolved target lives inside `plugins_dir`. Opted in at the two read-paths that operate on installed plugins: `_require_installed_plugin` (CLI update/remove) and `_user_installed_plugin_dir` (dashboard update/remove). The install path keeps the default (`allow_subdir=False`) because freshly-cloned plugins always land top-level under `~/.hermes/plugins/<name>/`. Adds 6 targeted unit tests covering the new flag's allow/reject matrix.	2026-05-22 19:50:32 -07:00
Austin Pickett	487c398dcf	refactor(web): dashboard typography & contrast pass Removes the global `uppercase` + `font-mondwest` from the App.tsx root that forced every page to opt-out, replaces stacked-alpha text colors with semantic tokens for WCAG-AA contrast across all 7 themes, and applies the new `text-display` utility from @nous-research/ui@0.16.0 on intentional brand chrome (page titles, sidebar headings, segmented filters) only. Bumps every sub-12px arbitrary text size to text-xs. Also widens the dashboard plugin routes (/api/dashboard/agent-plugins/{name:path}/...) so category-namespaced plugins like observability/langfuse and image_gen/openai can be enabled/disabled from the dashboard — previously the FE encodeURIComponent-ed the slash and the backend {name} route rejected it. _validate_plugin_name still blocks .. and backslash, and strips leading/trailing slash. Touches sessions/env/keys page chrome and adds two new i18n keys (`overview`, `showMore`/`showLess`) across all 18 locales. Squashes 19 commits from PR #28832. Co-authored-by: Hermes <noreply@nousresearch.com>	2026-05-22 19:50:32 -07:00
ethernet	dc4b0465b5	feat(ci): use 6-way slicing based on benchmark results Benchmarked 4/5/6/7/8 slices with LPT duration-balanced distribution: - 4 slices: 4.8m wall, 135s spread - 5 slices: 3.4m wall, 46s spread - 6 slices: 3.3m wall, 26s spread ← optimal - 7 slices: 3.9m wall, 109s spread - 8 slices: 3.7m wall, 96s spread 6 slices is the sweet spot: lowest wall time, tightest spread. 7+ gets slower due to per-slice startup overhead dominating. Also removes benchmark branch markers from save-durations condition.	2026-05-22 19:46:18 -07:00
ethernet	e7cb5d4b68	fix: clean push triggers	2026-05-22 19:46:18 -07:00
ethernet	f89afdbd17	fix(test): deflake two intermittent CI failures - test_browser_secret_exfil: mock _run_browser_command instead of launching real Chrome (secret check is pre-launch, browser is irrelevant to the assertion) - test_web_server: add time.sleep(0.05) after pub.send_text() to yield the event loop before receive_text(). TestClient's sync mode can race the broadcast handler otherwise, hanging the test.	2026-05-22 19:46:18 -07:00
ethernet	510df6eaf4	test: 4-way slice benchmark (with cache save)	2026-05-22 19:46:18 -07:00
ethernet	b689624aee	feat(ci): 4-way matrix slicing with LPT duration-balanced distribution run_tests_parallel.py: - --slice I/N flag (also HERMES_TEST_SLICE env var) runs only the I-th slice of N, distributing files across slices by cached duration using LPT (Longest Processing Time first) greedy algorithm so each slice gets roughly equal wall time - Duration cache (test_durations.json): maps relative file paths to last-observed subprocess wall time. _save_durations merges with existing cache so entries from other slices are preserved. - Per-file subprocess timing in progress output + end-of-run distribution summary (percentiles, top-10 slowest, <1s/<2s counts) - Unknown files default to 2.0s estimate (~P50), spread evenly by LPT .github/workflows/tests.yml: - Matrix strategy: slice [1, 2, 3, 4] with fail-fast: false - Each slice restores duration cache from main (stable key, no SHA), runs its portion, uploads per-slice durations as artifacts - save-durations job (main only, if: always()) downloads all 4 artifacts, merges into single cache entry for future PRs - Timeout reduced from 60min to 30min per slice (~1/4 the work) Cache design: - Stable key (test-durations) not keyed by commit SHA — durations are about files, not commits, and SHA-keyed caches miss on every new commit and on PR merge commits - actions/cache scoping: main's cache is visible to all PRs targeting main; feature branches without a cache still work (default 2.0s) - No dotfile prefix (upload-artifact v7 skips hidden files)	2026-05-22 19:46:18 -07:00
Teknium	a84cec61ca	fix(minimax-oauth): refresh short-lived access tokens per request (#30619) * fix(minimax-oauth): refresh short-lived access tokens per request MiniMax OAuth issues ~15-minute access tokens. The Anthropic SDK caches api_key as a static string at client construction, so a session that resolves credentials once at startup keeps sending the same bearer until MiniMax returns 401 mid-session. Swap the static string for a callable token provider, reusing the existing Entra-ID bearer-hook infrastructure in build_anthropic_client. The callable re-reads auth.json on each invocation and calls _refresh_minimax_oauth_state, which is a no-op when the token still has more than 60s of life left and refreshes proactively otherwise. Refreshes persist to auth.json so other processes (gateway, cron) see them immediately. The wire-up lives at the agent-init / model-switch boundary rather than in resolve_runtime_provider, so aux client paths that hand the api_key string to OpenAI(api_key=...) are unaffected. * docs: add infographic for minimax-oauth token refresh	2026-05-22 15:16:15 -07:00
ethernet	2f320cb35a	fix(ci): supply-chain-audit uses two-dot diff, causing false positives on stale-branch PRs The workflow diffs base.sha..head.sha (two-dot), which compares the tip-of-main tree directly against the PR tip. When files land on main after a PR branched off, they appear in the diff even though the PR never touched them — triggering false-positive findings. Example: PR #30609 was flagged for hermes_cli/setup.py, a file added to main by an unrelated commit after the PR branched. Switch to three-dot diff (base.sha...head.sha), which diffs from the merge base to the PR tip — only changes introduced by this PR are included. Applied to all four diff commands in both jobs (scan and dep-bounds).	2026-05-22 15:15:53 -07:00
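The difference between the two diff forms in the commit above can be modelled with a toy example. This is an illustration of the semantics only, not the workflow's code: trees are plain dicts of path to content version, `changed_paths` is a hypothetical helper, and `agent.py` is an invented file name; `hermes_cli/setup.py` is the file named in the commit message.

```python
def changed_paths(old_tree, new_tree):
    """Paths whose content differs between two snapshots (toy stand-in for a git diff)."""
    return sorted(
        path
        for path in old_tree.keys() | new_tree.keys()
        if old_tree.get(path) != new_tree.get(path)
    )


# Snapshot where the PR branched off main.
merge_base = {"agent.py": "v1"}
# main later gained an unrelated file.
main_tip = {"agent.py": "v1", "hermes_cli/setup.py": "v1"}
# The PR only touched agent.py.
pr_tip = {"agent.py": "v2"}

# base..head compares the two tips directly, so main's newer file shows up as a difference.
two_dot = changed_paths(main_tip, pr_tip)
# base...head compares the merge base with the PR tip, so only the PR's own changes remain.
three_dot = changed_paths(merge_base, pr_tip)

if __name__ == "__main__":
    print(two_dot)    # ['agent.py', 'hermes_cli/setup.py']
    print(three_dot)  # ['agent.py']
```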
Teknium	2233b8b244	infographic: PR #30609 Termux cold-start salvage (#30618)	2026-05-22 14:32:41 -07:00
adybag14-cyber	a3beee475b	perf(termux): speed up bare cli prompt startup	2026-05-22 14:27:38 -07:00
adybag14-cyber	6c3fd9714f	perf(termux): fast-path cli version startup	2026-05-22 14:27:38 -07:00
Teknium	d11cbb1032	infographic: PR #30591 Discord adapter → bundled plugin salvage (#30614)	2026-05-22 14:24:03 -07:00
Teknium	7849a3d73f	fix(gateway,discord-plugin): _platform_status must respect is_connected=False, not silently fall back to check_fn Two bugs surfaced by PR #24356 migrating Discord into the registry: 1. plugins/platforms/discord/adapter.py::_is_connected — read DISCORD_BOT_TOKEN via hermes_cli.gateway.get_env_value (the abstraction tests patch) instead of os.getenv directly. The legacy non-registry path used get_env_value; bypassing it broke test_setup_openclaw_migration which patches gateway_mod.get_env_value to simulate a hermetic env. 2. hermes_cli/gateway.py::_platform_status — when entry.is_connected is defined and returns False, return 'not configured' immediately. Don't fall back to entry.check_fn(), which would let 'SDK is installed' override 'no token configured' and incorrectly report the platform as ready. The fallback to check_fn is the right behaviour only when is_connected is None (not registered). Fixes 5 test failures observed on CI for PR #24356: - tests/hermes_cli/test_setup.py::test_setup_gateway_skips_service_install_when_systemctl_missing - tests/hermes_cli/test_setup.py::test_setup_gateway_in_container_shows_docker_guidance - tests/hermes_cli/test_setup_irc.py::TestIRCGatewaySetupFreshInstall::test_setup_gateway_irc_counts_as_messaging_platform - tests/hermes_cli/test_setup_openclaw_migration.py::TestGetSectionConfigSummary::test_gateway_returns_none_without_tokens - tests/hermes_cli/test_setup_openclaw_migration.py::TestSetupWizardSkipsConfiguredSections::test_sections_skipped_when_migration_imported_settings Same _platform_status bug exists for sibling plugin platforms (teams, google_chat) whose check_fn returns true on SDK install alone; their tests just never exercised the registry path before. The bug only became test-visible when Discord migrated into the registry. Validation: 11,167 tests across tests/gateway/ + tests/cron/ + tests/tools/test_send_message_tool.py + tests/hermes_cli/ pass with zero failures.	2026-05-22 14:21:41 -07:00
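The `_platform_status` rule from the commit above can be sketched as a small decision function. This is a simplified model under stated assumptions: `PlatformEntry` and `platform_status` are hypothetical names, the `"ready"` label is invented for the positive case, and the behaviour when `is_connected` returns True (deferring to `check_fn`) is a guess; the commit message only specifies the False and None cases.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PlatformEntry:
    """Hypothetical, reduced registry entry with only the two hooks discussed."""
    check_fn: Callable[[], bool]
    is_connected: Optional[Callable[[], bool]] = None


def platform_status(entry):
    # An explicit "not connected" answer wins; do not let check_fn override it.
    if entry.is_connected is not None and not entry.is_connected():
        return "not configured"
    # Fall back to check_fn when is_connected is not registered.
    # (Also used here when is_connected returns True, which is an assumption.)
    return "ready" if entry.check_fn() else "not configured"


if __name__ == "__main__":
    sdk_installed = lambda: True
    no_token = lambda: False
    # SDK present but no token: must not be reported as ready.
    print(platform_status(PlatformEntry(check_fn=sdk_installed, is_connected=no_token)))
    # No is_connected hook registered: check_fn decides.
    print(platform_status(PlatformEntry(check_fn=sdk_installed)))
```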
kshitijk4poor	cc8e5ec2af	refactor(gateway): migrate Discord adapter to bundled plugin (full Teams parity) First migration of an existing built-in platform adapter to the plugin system established by IRC / Teams / LINE / Google Chat. Closes #24325; advances the umbrella refactor in #3823. Matches Teams' shape exactly — adapter under ``plugins/platforms/discord/`` with the standard ``__init__.py`` / ``adapter.py`` / ``plugin.yaml`` shell, ``register(ctx)`` entry point, no back-compat shim at the old import path, and full parity for the four hooks Teams uses plus the ``apply_yaml_config_fn`` hook that landed in #25443 (the Discord plugin is the first consumer of that hook): * ``standalone_sender_fn`` — out-of-process cron delivery via REST API * ``setup_fn`` — interactive ``hermes setup gateway`` wizard * ``apply_yaml_config_fn`` — translate ``config.yaml`` ``discord:`` keys into ``DISCORD_`` env vars (replaces the hardcoded block in ``gateway/config.py``) ``is_connected`` — declares connection state from ``DISCORD_BOT_TOKEN`` * ``check_fn`` — lazy-installs ``discord.py`` on demand * plus ``allowed_users_env``, ``allow_all_env``, ``cron_deliver_env_var``, ``max_message_length``, ``emoji``, ``required_env``, ``install_hint`` * ``gateway/platforms/discord.py`` (5,101 LOC) → ``plugins/platforms/discord/adapter.py`` (git rename, R090). * New ``plugins/platforms/discord/{__init__.py, plugin.yaml}`` with ``requires_env`` / ``optional_env`` declarations. * Append ``register(ctx)`` block + new hook implementations (``_standalone_send``, ``interactive_setup``, ``_apply_yaml_config``, ``_clean_discord_user_ids``, ``_is_connected``, ``_build_adapter``, plus helpers ``_DISCORD_CHANNEL_TYPE_PROBE_CACHE`` etc.) to the adapter. * Replace the ``Platform.DISCORD elif`` branch in ``GatewayRunner._create_adapter()`` (−9 LOC) with a generic post-creation hook (+6 LOC) in the registry path: any plugin adapter that declares a ``gateway_runner`` attribute now gets it auto-injected. 
Webhook's built-in branch is unchanged (it doesn't go through the registry path). * Move ``_send_discord`` (190 LOC) and helpers (``_DISCORD_CHANNEL_TYPE_PROBE_CACHE``, ``_remember_channel_is_forum``, ``_probe_is_forum_cached``, ``_derive_forum_thread_name``) from ``tools/send_message_tool.py`` into the plugin as ``_standalone_send``. * Wire via ``standalone_sender_fn=_standalone_send`` (Teams pattern; same gap fixed in #21804 for other plugin platforms). * Replace the Discord ``elif`` in ``tools/send_message_tool.py`` ``_send_to_platform`` with a 10-line registry-hook dispatch. * Drop the ``DiscordAdapter`` import and the ``Platform.DISCORD: DiscordAdapter.MAX_MESSAGE_LENGTH`` ``_MAX_LENGTHS`` entry — the registry's ``max_message_length=2000`` covers it. * Move ``_setup_discord`` and ``_clean_discord_user_ids`` (68 LOC) from ``hermes_cli/setup.py`` into the plugin as ``interactive_setup``. * Wire via ``setup_fn=interactive_setup``. CLI helpers (``prompt``, ``print_info``, etc.) are lazy-imported so the plugin's module-load surface stays minimal. * Remove ``"discord": _s._setup_discord`` from ``hermes_cli/gateway.py::_builtin_setup_fn``. * Remove the entire 32-line ``_PLATFORMS["discord"]`` static dict entry — Discord's setup metadata is now discovered dynamically via ``_all_platforms()`` from the registry entry. * Move the 59-line ``discord_cfg`` YAML→env bridge from ``gateway/config.py::load_gateway_config()`` into the plugin as ``_apply_yaml_config``. Covers ``require_mention``, ``thread_require_mention``, ``free_response_channels``, ``auto_thread``, ``reactions``, ``ignored_channels``, ``allowed_channels``, ``no_thread_channels``, ``allow_mentions.{everyone,roles,users, replied_user}``, and ``reply_to_mode`` (including the YAML 1.1 ``off``-as-False coercion and the ``extra.reply_to_mode`` fallback). * Wire via ``apply_yaml_config_fn=_apply_yaml_config``. 
* The hook runs BEFORE ``_apply_env_overrides`` and after the generic shared-key loop, exactly as documented in ``website/docs/developer-guide/adding-platform-adapters.md``. * Behavior is preserved exactly — every assignment still uses ``not os.getenv(...)`` guards so env vars take precedence over YAML. All 78 references to the old import path are rewritten — no back-compat shim: * 51 ``from gateway.platforms.discord import X`` → ``from plugins.platforms.discord.adapter import X`` * 5 ``import gateway.platforms.discord as discord_platform`` → ``import plugins.platforms.discord.adapter as discord_platform`` * 1 ``from gateway.platforms import discord as discord_mod`` → ``from plugins.platforms.discord import adapter as discord_mod`` * 21 ``mock.patch("gateway.platforms.discord.X")`` strings → ``mock.patch("plugins.platforms.discord.adapter.X")`` * 1 docstring reference in ``hermes_cli/commands.py`` * 1 import in ``tools/send_message_tool.py`` (now removed entirely) The import-safety test in ``tests/gateway/test_discord_imports.py`` is updated to purge the new canonical module name from ``sys.modules``. 38 files changed, +621 / −473 — net positive due to the YAML hook implementation (89 new LOC in the plugin trading for 59 deleted in core), but every line moved has a clear plugin home now. The git rename is detected at R090 because the adapter gained ~340 LOC of moved-in hook implementations (``_standalone_send`` + ``interactive_setup`` + ``_apply_yaml_config`` + helpers). * All 568 Discord-specific tests pass across 25 ``test_discord_.py`` files plus voice/send/text-batching/reload-skills/stream-consumer/ integration tests. 
All 147 tests in the YAML-touching subset (``test_discord_reply_mode``, ``test_discord_free_response``, ``test_discord_allowed_channels``, ``test_discord_allowed_mentions``, ``test_discord_channel_controls``, ``test_discord_reactions``, ``test_discord_thread_persistence``, ``test_runtime_footer``) pass — this is the strongest signal that the YAML→env hook behaves identically to the legacy block. * Broader gateway/cron/integration sweep (1297 tests) introduces zero new failures vs ``main``. Pre-existing failures in ``tests/gateway/test_tts_media_routing.py`` and ``tests/e2e/test_platform_commands.py`` reproduce identically on the unchanged ``main`` revision. * Plugin discovery sanity check confirms Discord registers alongside the other four platform plugins: Registered platforms: ['discord', 'google_chat', 'irc', 'line', 'teams'] These Discord-shaped tendrils in core were deliberately not moved — they are generic platform-registry concerns affecting every platform, not Discord-specific: * ``gateway/config.py:1205`` ``DISCORD_BOT_TOKEN → config.token`` env enablement — same shape Telegram has. The existing ``env_enablement_fn`` registry hook only seeds ``extra``, not ``.token``, so it can't replace this without an adapter refactor to read from ``extra["bot_token"]``. * ``gateway/run.py`` voice-mode hooks (``self.adapters.get(Platform.DISCORD)`` for ``start_voice_mode``/``stop_voice_mode``), role-based auth, ``DISCORD_ALLOW_BOTS`` branch in ``_is_user_authorized``, ``_UPDATE_ALLOWED_PLATFORMS`` frozenset, and the per-platform allowlist maps — generic platform-registry concerns. * ``Platform.DISCORD`` enum literal — stable identifier used as dict keys throughout the codebase; removing it is a separate refactor with no real benefit. * ``tools/discord_tool.py`` and ``tools/environments/local.py`` — first-class agent tools and env-passthrough config, neither is the gateway adapter. Each of these is worth its own scoping issue when the time comes.	2026-05-22 14:21:41 -07:00
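The YAML→env bridge described in the commit above keeps env vars authoritative by guarding each assignment. A minimal sketch of that pattern follows. The env var names in `KEY_TO_ENV`, the value serialisation (`"true"`/`"false"`, comma-joined lists), and the function signature are all assumptions for illustration; the precedence guard and the `off`-as-False coercion for `reply_to_mode` are the parts taken from the commit message.

```python
import os

# Hypothetical mapping; the real plugin's env var names are not listed in the commit message.
KEY_TO_ENV = {
    "require_mention": "DISCORD_REQUIRE_MENTION",
    "ignored_channels": "DISCORD_IGNORED_CHANNELS",
    "reply_to_mode": "DISCORD_REPLY_TO_MODE",
}


def apply_yaml_config(discord_cfg, environ=None):
    """Copy `discord:` YAML keys into env vars without overriding values already set."""
    environ = os.environ if environ is None else environ
    for key, env_name in KEY_TO_ENV.items():
        if key not in discord_cfg:
            continue
        # Same idea as the `not os.getenv(...)` guard: an existing env var wins.
        if environ.get(env_name):
            continue
        value = discord_cfg[key]
        if key == "reply_to_mode" and value is False:
            # YAML 1.1 parses a bare `off` as False; restore the intended string.
            value = "off"
        elif isinstance(value, bool):
            value = "true" if value else "false"  # serialisation is an assumption
        elif isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)  # serialisation is an assumption
        environ[env_name] = str(value)
    return environ


if __name__ == "__main__":
    env = {"DISCORD_REQUIRE_MENTION": "false"}  # pre-existing env value
    cfg = {"require_mention": True, "reply_to_mode": False, "ignored_channels": [1, 2]}
    print(apply_yaml_config(cfg, env))
```

Passing a plain dict as `environ` keeps the sketch testable without mutating the real process environment.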

9258 commits