hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-13 14:02:16 +00:00

Author	SHA1	Message	Date
Teknium	cc93053b42	fix(xai-oauth): apply WKE disambiguator to recovery-path catch-all (#29344) _recover_with_credential_pool had a second classification site that blanket-treated any 403 against xai-oauth as entitlement (defense-in-depth for #26847). That override defeated the new _is_entitlement_failure disambiguator from the parent commit — bad-credentials 403s still short-circuited the refresh path. Apply the same WKE-unauthenticated / OAuth2-validation-phrase guard at the override site so xAI's authoritative 'this is auth, not entitlement' signal wins there too. The #26847 catch-all still triggers for genuine entitlement bodies that don't carry the disambiguator. Closes the end-to-end gap exposed by test_recover_with_credential_pool_refreshes_on_xai_bad_credentials_403.	2026-05-23 02:48:13 -07:00
xxxigm	b5ea6a5c80	test(xai-oauth): regression coverage for the bad-credentials disambiguator (#29344 ) Eleven new tests pinning the #29344 fix. Layout mirrors the existing "Fix D" entitlement section so the bad-credentials disambiguator sits alongside the entitlement-block tests it complements. Classifier-level coverage: * ``test_is_entitlement_failure_false_for_bad_credentials_wke_suffix`` — verbatim shape from the reporter's wire capture (``{code: 'caller does not have permission', error: 'OAuth2 access token could not be validated. [WKE=unauthenticated:bad-credentials]'}``) ↦ classifier must return False so the refresh path runs. * ``test_is_entitlement_failure_false_for_wke_suffix_in_normalized_shape`` — same body after ``_extract_api_error_context`` has rewritten it to ``{reason, message}``. The disambiguator must fire in BOTH shapes; without this guard the production call site at ``_recover_with_credential_pool`` (which goes through the normalised extractor) would still misclassify. * ``test_is_entitlement_failure_false_for_any_wke_unauthenticated_variant`` — parametrised forward-compat: ``bad-credentials``, ``expired-token``, ``revoked``, ``some-future-reason``. xAI documents the prefix as stable, the suffix after the colon as a reason code that can grow; every variant under ``unauthenticated:`` must route to refresh. * ``test_is_entitlement_failure_false_via_oauth2_validation_phrase_alone`` — belt-and-braces guard: if a future API revision drops the WKE suffix but keeps "OAuth2 access token could not be validated", we still classify correctly. * ``test_is_entitlement_failure_wke_signal_overrides_entitlement_keywords`` — defensive: if a body ever carries BOTH the WKE suffix and entitlement language, the WKE signal wins. Auth is recoverable; entitlement isn't, and a refreshed token will resurface the entitlement message on the next request. 
* ``test_is_entitlement_failure_case_insensitive_wke_match`` — pins that the classifier lowercases the haystack so a future xAI build that uppercases the prefix doesn't reintroduce the bug. Recovery-path coverage (end-to-end through ``_recover_with_credential_pool``): * ``test_recover_with_credential_pool_refreshes_on_xai_bad_credentials_403`` — the headline test the reporter requested: a bad-credentials 403 with the exact wire body must call ``try_refresh_current()`` exactly once and ``_swap_credential`` once. Pre-fix this returned ``(False, _)`` because the entitlement classifier over-matched and short-circuited the refresh path. * ``test_recover_with_credential_pool_still_blocks_real_entitlement`` — companion regression guard for #26847: a pure unsubscribed-account body (no WKE suffix, no OAuth2-validation phrase) must still surface as entitlement and skip refresh. The new disambiguator must not weaken the original loop-protection it was added to preserve. The scaffolding reuses ``_make_codex_agent``, ``_FakePool``, and the existing ``MagicMock`` patterns from the surrounding tests so the new section reads as a natural extension of "Fix D" rather than a separate test file.	2026-05-23 02:48:13 -07:00
xxxigm	8b3cb930c9	fix(xai-oauth): honor [WKE=unauthenticated:...] disambiguator in entitlement classifier (#29344 ) ``_is_entitlement_failure`` over-matched on xAI 403s. xAI returns the same permission-denied ``code`` text for two distinct conditions: 1. Unsubscribed account ("active Grok subscription. Manage at https://grok.com" in the ``error`` field). 2. Stale OAuth access token ("OAuth2 access token could not be validated. [WKE=unauthenticated:bad-credentials]" in the ``error`` field). The classifier's "does not have permission + grok" substring heuristic treated both identically, so the credential-pool refresh path was short-circuited for case (2) — long-running TUI sessions stuck on a stale OAuth token surfaced a non-retryable client error and the user had to exit + reopen the TUI to recover (the startup-resolve path bypasses the classifier entirely, which is why bridge adapters with proactive refresh cadences didn't see this in practice). This patch adopts the reporter's recommended fix (option 1, tightest): honor xAI's explicit ``[WKE=unauthenticated:...]`` suffix and the ``OAuth2 access token could not be validated`` phrasing as authoritative "this is auth, not entitlement" signals. When either appears anywhere in the body's text fields, the classifier returns False eagerly — before the entitlement keyword checks run — so the refresh-on-401 path takes over and the existing loop-protection still guards against runaway refresh storms if the refresh itself fails. Two small adjustments fall out of this: * The haystack now also covers ``code`` and ``error`` keys directly, not just the ``message``/``reason`` shape ``_extract_api_error_context`` produces. Real runtime paths use the normalised shape, but the test suite and any future call sites that pass raw bodies get the same treatment. Backwards compatible: missing keys default to empty strings, the haystack still skips when everything is blank. 
* Both disambiguator checks fire BEFORE the entitlement keyword checks. If a future xAI body somehow lands with both an entitlement message AND the WKE suffix, the WKE suffix wins (correct — auth is recoverable; entitlement is not, and a refreshed token will surface the entitlement message on the next request anyway). Existing tests (``test_is_entitlement_failure_matches_real_xai_bodies``, ``test_is_entitlement_failure_false_for_unrelated_auth_errors``, ``test_recover_with_credential_pool_skips_refresh_on_entitlement_403``, ``test_recover_with_credential_pool_still_refreshes_genuine_auth_failure``) continue to pass unchanged — the unsubscribed-account path, the generic auth-error path, and the refresh-on-401 path are all left intact.	2026-05-23 02:48:13 -07:00
Teknium	64b3eb0dd7	docs: surface Nous Portal on pages where it solves a real problem the page describes (#30874 ) Follow-up to #30869. Adds Portal mentions on user-facing pages that naturally call for an LLM + tool credentials but didn't previously acknowledge Portal as a one-stop option. - getting-started/installation.md: tip after the 'after install' block pointing at 'hermes setup --portal' for users who want everything wired at once instead of piecewise via 'hermes model' + 'hermes tools'. - user-guide/configuring-models.md: small tip near the top — the page is literally about provider/model choice and previously had zero Portal mention. - user-guide/features/voice-mode.md: Prerequisites need both an LLM and TTS — a Portal subscription is the single setup that covers both. - user-guide/features/batch-processing.md: highlights Portal as a predictable-cost option for parallel agent runs that hit many APIs. - user-guide/features/api-server.md: backend needs models + tools; one Portal sub gives a fully-equipped OpenAI-compatible endpoint. - user-guide/windows-native.md: early-beta users on Windows benefit most from skipping per-tool Windows-key-juggling. - integrations/providers.md: updates the existing Tool Gateway tip and the Nous Portal section to mention the new commands. - user-guide/features/fallback-providers.md: Nous row in the provider table now lists 'hermes setup --portal' as the fresh-install path. Tone discipline: one Portal mention per page, concrete CLI commands (no marketing copy), always solving a problem the page itself sets up.	2026-05-23 02:47:53 -07:00
Teknium	f3fb7899d0	docs: surface 'hermes setup --portal' and 'hermes portal' across user-facing pages (#30869 ) PR #30860 added a one-shot Portal setup command and a small portal CLI surface. Update the docs so the new commands are discoverable without upgrading the tone of existing Portal mentions. - getting-started/quickstart.md: small tip near Choose a Provider pointing at 'hermes setup --portal' as the easiest fresh-install path. - user-guide/features/tool-gateway.md: lead the Get-Started section with 'hermes setup --portal' for fresh installs, keep 'hermes model' for already-configured users, and add 'hermes portal status / tools' to the activity-check commands. - user-guide/features/{web-search,image-generation,tts,browser}.md: the existing 'Nous Subscribers' tip blocks now name the one-shot command for new installs, keeping the existing 'hermes tools' path for users who only want to swap a single backend. - reference/cli-commands.md: register 'hermes portal' in the top-level command table, add a 'hermes portal' section with subcommands, and add '--portal' to the 'hermes setup' options table. Tone: each page already had a Portal mention. This PR keeps the per-page count to one and uses concrete CLI commands rather than promotional copy. Tool Gateway page is the one exception (the whole doc is about Portal).	2026-05-23 02:42:31 -07:00
Teknium	9acf949e34	feat(telegram): edit status messages in place instead of appending (#30864) Closes #30045. Based on @qike-ms's PR #30141. Telegram status callbacks (lifecycle, compression, context-pressure) used to append a fresh bubble on every emit. Now adapter tracks {(chat_id, status_key) -> message_id}; first call sends, subsequent calls edit. Failed edits drop the cache entry and fall through to a fresh send. - gateway/platforms/telegram.py: send_or_update_status() (+34 LOC) - gateway/run.py: route _status_callback_sync through it when the adapter supports it; plain adapter.send() otherwise (+15 LOC) - 5 tests covering first send / edit-in-place / edit-failure fallback / distinct key & chat isolation	2026-05-23 02:42:10 -07:00
Teknium	4b6d68bd64	test(fast-command): stub _load_gateway_runtime_config too PR `2362cc468` ("fix(gateway): enforce env variable template expansion on runtime config loaders") refactored `_load_service_tier` to read config via the new `_load_gateway_runtime_config` wrapper instead of opening `_hermes_home/config.yaml` directly. The `test_run_agent_passes_priority_processing_to_gateway_agent` test still only stubbed `_load_gateway_config` (the inner loader), so the runtime wrapper saw an empty config and `_load_service_tier` returned None, breaking the test: FAILED tests/gateway/test_fast_command.py::test_run_agent_passes_priority_processing_to_gateway_agent - AssertionError: assert None == 'priority' Fix: also stub `_load_gateway_runtime_config` to return the expected `agent.service_tier=fast` config, so the test once again drives the priority routing path it was written to verify. Confirmed reproducing on current main before the patch and passing after.	2026-05-23 02:40:33 -07:00
Zyrixtrex	61ac118724	fix(webhook): enforce INSECURE_NO_AUTH safety rail on dynamic route reloads	2026-05-23 02:39:12 -07:00
Teknium	b4cf5b65dd	feat(portal): one-shot setup, status CLI, and Nous-included markers (#30860) * feat(portal): one-shot setup, status CLI, and Nous-included markers Four small Portal-aware surfaces that drive subscription value without adding friction for non-Portal users. - hermes setup --portal: one-shot Nous OAuth + provider switch + Tool Gateway opt-in. Shareable as a single command from docs/social. - hermes portal {status,open,tools}: small surface over Portal auth + Tool Gateway routing. Defaults to 'status' when no subcommand. - Tool picker (hermes tools): when the user is logged into Nous, mark Nous-managed provider rows with a star and 'Included with your Nous subscription'. Suppressed when not authed — non-subscribers see the picker unchanged. - BYOK setup hint: a single dim line 'Available through Nous Portal subscription.' appears when the user is being prompted for a paid API key (Firecrawl, FAL, ElevenLabs, Browserbase, etc.) AND the category has a Nous-managed sibling AND the user is not already authed to Nous. Suppressed in all other cases. Tested live end-to-end in an isolated HERMES_HOME with a simulated authed and unauthed user. Targeted suite (tests/hermes_cli/test_tools_config.py + test_setup.py) passes 97/97. * fix: add portal to _BUILTIN_SUBCOMMANDS so plugin discovery fast-path skips it	2026-05-23 02:39:09 -07:00
Teknium	6942b1836e	fix(skills_guard): explain why --force is rejected on dangerous verdicts Follow-up to @sprmn24's verdict-logic fix. The previous block-message ended in 'Use --force to override' regardless of verdict — but as of the --force fix above, dangerous community/trusted skills can't be overridden by --force at all. The misleading hint sends users in a loop. Replace it with a specific message that tells them what the documented behavior actually is. Adds two regression tests covering the dangerous-verdict message shape and one that pins the existing --force hint for non-dangerous blocks.	2026-05-23 02:37:30 -07:00
sprmn24	789043b691	fix(security): update tests for verdict and --force changes	2026-05-23 02:37:30 -07:00
sprmn24	0f8215f633	fix(security): correct verdict logic and enforce --force limitation in skills_guard - _determine_verdict() returned 'caution' for medium/low-only findings, causing community skills with harmless patterns (e.g. path traversal notation, unpinned pip install) to be incorrectly blocked. Now returns 'safe' when only medium/low severity findings are present. - should_allow_install() allowed --force to override 'dangerous' verdict, contradicting documented behavior that --force does NOT override dangerous scan results. Added explicit check to prevent force-installing skills with dangerous verdict.	2026-05-23 02:37:30 -07:00
Teknium	db489a315f	fix(tests): allowlist tmp_path for kanban_notify artifact delivery (#30852) `_deliver_kanban_artifacts` routes candidates through `BasePlatformAdapter.filter_local_delivery_paths` (added in `41d2c758c`), which rejects paths outside `MEDIA_DELIVERY_SAFE_ROOTS`. The two artifact-delivery tests create fixtures under `tmp_path`, which lives outside the cache roots — so under CI's hermetic HOME the filter silently dropped both fake files and the assertions on `images_uploaded` / `documents_uploaded` failed. Fix: monkeypatch `HERMES_MEDIA_ALLOW_DIRS=str(tmp_path)` in both tests so the safety filter accepts the fixtures. Production behaviour unchanged; test-side fix only. CI fail repro on origin/main: test (6) shard, both test_notifier_uploads_artifacts_on_completion and test_notifier_artifact_delivery_skips_missing_files.	2026-05-23 02:34:34 -07:00
xxxigm	5b6f0b695b	test(tls-fd-recycle): pin shutdown-only + thread-aware close contract (#29507 ) Ten regressions across both prongs of the #29507 fix, organised so each test names exactly which way the bug could come back: Prong 1 — ``force_close_tcp_sockets``: * ``shutdown_only_no_close`` is the smoking-gun assertion. If a future refactor adds back ``sock.close()`` to this helper, the FD-recycling race that wrote TLS bytes on top of ``kanban.db`` is back, and this trips. * ``uses_shut_rdwr`` pins that both halves are shut down (a half-close wouldn't unblock a worker stuck in ``recv``). * ``swallows_oserror_on_shutdown`` covers the already-shutdown case. * ``handles_multiple_pool_entries`` walks all pool connections. Prong 2 — thread-aware ``_close_request_client_once``: * ``stranger_thread_aborts_only_no_close`` simulates the asyncio_0 → Thread-1616 interrupt path: stranger drives abort, holder stays populated for the worker's eventual finally. * ``owner_thread_pops_and_full_close`` is the worker-thread path: pops + full close. * ``stranger_then_owner_close_sequence_runs_full_close_exactly_once`` replays the reporter's exact timeline at object level: abort runs once, full close runs once, holder ends empty. Agent surface: * ``_abort_request_openai_client_does_not_call_client_close`` pins that the new entrypoint shuts sockets and emits the ``deferred_close=stranger_thread`` marker but never calls ``client.close()``. * ``_abort_request_openai_client_null_client_is_noop`` defensive. End-to-end: * ``fd_recycle_window_closed_by_shutdown_only`` reproduces the race at object level — runs the abort path from a stranger thread and asserts that no ``close()`` ever fires, so the kernel can never recycle the FD under the owner's still-active reference.	2026-05-23 02:31:10 -07:00
xxxigm	30c22f1158	fix(api-call): defer client.close() to owning worker thread on interrupt (#29507 ) Layer-2 defense for the FD-recycling race: even with ``force_close_tcp_sockets`` reduced to shutdown-only, the followup ``client.close()`` in ``_close_openai_client`` still walks the httpx pool and closes sockets — and if called from a stranger thread (the interrupt-check loop, the stale-call detector) it has the same FD-recycling exposure that wrote a TLS record on top of ``kanban.db``. Stamp the request_client_holder with the owning thread's ident at ``_set_request_client`` time. In ``_close_request_client_once``: * Owning thread (the worker's ``finally``) → pop + ``client.close()`` via ``_close_request_openai_client``, exactly as before. * Stranger thread → ``_abort_request_openai_client`` (new): only ``shutdown(SHUT_RDWR)`` the pool sockets and log a deferred-close marker. The holder stays populated so the worker's eventual ``finally`` performs the real close from its own thread context, where the FD release races nothing. Applied symmetrically to both the non-streaming ``interruptible_api_call`` and the streaming variant — both routinely get hit by stranger-thread interrupts. The log field ``tcp_force_closed=N`` keeps its existing shape; the new abort path adds ``deferred_close=stranger_thread`` so production triage can distinguish the two close kinds.	2026-05-23 02:31:10 -07:00
xxxigm	e2a7d73a66	fix(force_close_tcp_sockets): shutdown only, do not release FD (#29507 ) The helper used to call ``socket.shutdown(SHUT_RDWR)`` followed by ``socket.close()`` to drop CLOSE-WAIT entries immediately. On its own ``shutdown()`` is safe from any thread — it only sends FIN and breaks pending ``recv``/``send`` — but ``close()`` releases the FD integer to the kernel. When the helper runs on a stranger thread (the interrupt loop, the stale-call detector) the FD release races the owning httpx worker thread that still has the same integer cached inside the SSL BIO. The kernel then recycles that integer to the next ``open()`` call — in production, kanban dispatcher's ``kanban.db`` — and the worker's delayed TLS flush writes a 24-byte TLS application-data record on top of the SQLite header. Restrict the helper to ``shutdown(SHUT_RDWR)`` only. The owning httpx worker's own unwind will close the underlying socket via the same Python ``socket.socket`` object, which atomically swaps ``_fd`` to -1 before issuing ``close(2)`` — no FD-aliasing window. The log field ``tcp_force_closed=N`` is kept (now counts shutdowns) so existing dashboards / log parsers keep working.	2026-05-23 02:31:10 -07:00
sprmn24	53cb6d32be	fix(agent): use atomic_json_write for request debug dumps instead of bare write_text	2026-05-23 02:30:57 -07:00
sprmn24	b183be95a2	fix(gateway-windows): atomic write for .cmd and startup launcher scripts	2026-05-23 02:30:41 -07:00
walli	60b0a0e006	fix(qqbot): fix SILK magic byte detection slice length _guess_ext_from_data: data[:5] == b"#!SILK" -> data[:6] (6-byte string) _looks_like_silk: data[:4] == b"#!SILK" -> data[:6] The previous slices were too short to ever match the 6-byte "#!SILK" literal, relying entirely on the "#!SILK_V3" (9-byte) and 0x02! (2-byte) fallback paths for SILK format detection.	2026-05-23 02:27:17 -07:00
walli	0e7448d63a	fix(qqbot): use original attachment filename for cached files Add original_name parameter to _download_and_cache, preferring the attachment metadata filename over the CDN URL path basename. Previously files were cached with meaningless QQ CDN hash names (e.g. qqdownload_...oadftnv5), causing ugly filenames when sent back to users. Aligns with qqbot-agent-sdk's AttachmentDownloader.download_document.	2026-05-23 02:27:17 -07:00
walli	a54f5afc70	fix(qqbot): handle op 7/9 and expand fatal close code set 1. Handle op 7 (Server Reconnect): close WS to trigger reconnect loop while preserving session for Resume 2. Handle op 9 (Invalid Session): check d value to determine if session is resumable; clear session only when not resumable 3. Remove 4009 from session-clearing set (connection timeout is resumable) 4. Expand fatal close codes: 4001/4002/4010-4014 now stop reconnect immediately instead of retrying uselessly 5. Add unit tests	2026-05-23 02:27:17 -07:00
walli	bbd77d165c	fix(qqbot): add INTERACTION intent and expose video/file cached paths 1. Add INTERACTION intent bit (1<<26) to _send_identify, fixing approval button clicks not being received (INTERACTION_CREATE events were never dispatched by the gateway) 2. Include local cached path in video/file attachment descriptions so the LLM can reference files for re-sending to users 3. Add unit tests (TestIdentifyIntents, TestProcessAttachmentsPathExposure)	2026-05-23 02:27:17 -07:00
teknium1	66d81f9e14	fix(gateway): don't swallow expansion errors in runtime config helper A bare except in _load_gateway_runtime_config would silently return the unexpanded dict on any _expand_env_vars failure — masking the very bug this helper exists to fix. Drop it; let the caller see real errors.	2026-05-23 02:27:08 -07:00
QuenVix	2362cc4688	fix(gateway): enforce env variable template expansion on runtime config loaders	2026-05-23 02:27:08 -07:00
QuenVix	d21ac579e9	fix(gateway): honor key_env in auth-failure fallback resolution	2026-05-23 02:25:53 -07:00
Teknium	99671a8634	test(kanban): allow tmp_path artifacts past media-delivery validator PR #`41d2c758c` ("Fix unsafe gateway media path delivery") tightened `validate_media_delivery_path` so that artifacts emitted by the agent must live inside `MEDIA_DELIVERY_SAFE_ROOTS` (Hermes-managed cache dirs) or an operator-allowlisted root via `HERMES_MEDIA_ALLOW_DIRS`. Two kanban-notifier tests put their PDFs and PNGs under pytest's `tmp_path`, which is correctly rejected by the new validator. They started failing on main as soon as that PR landed: FAILED tests/hermes_cli/test_kanban_notify.py::test_notifier_uploads_artifacts_on_completion FAILED tests/hermes_cli/test_kanban_notify.py::test_notifier_artifact_delivery_skips_missing_files Symptom in logs: "Skipping unsafe local file path outside allowed roots". The validator is doing exactly what it should — the tests were relying on the looser pre-fix behaviour. Fix: add `HERMES_MEDIA_ALLOW_DIRS=tmp_path` to the `kanban_home` fixture so artifacts under `tmp_path` are recognised as safe. This is the same allowlist mechanism the operator-facing env var documents.	2026-05-23 02:25:09 -07:00
Teknium	5772e638c9	chore: drop in-repo infographic/ directory; keep PR-body URLs only (#30854) PR infographics belong in PR descriptions, not committed to the repo. Removes the 13 archived directories under infographic/ and adds the path to .gitignore so future generations don't accidentally land in-tree. The fal.media URLs embedded in each PR's body remain the canonical artifact — those PR descriptions are the storage.	2026-05-23 02:25:03 -07:00
sprmn24	b2e6fdd3bf	fix(agent): log warning when fallback model normalization fails instead of silently swallowing	2026-05-23 02:23:24 -07:00
teknium1	70aaa774be	fix(opencode-go): emit Kimi reasoning_effort, match KimiProfile shape The Kimi K2 branch added in the prior commit only emitted extra_body.thinking and dropped reasoning_effort entirely. KimiProfile (api.moonshot.ai/v1) sends both fields, and OpenCode Go proxies to the same Moonshot backend. Mirror that shape on the Go path so /reasoning effort actually reaches Kimi. - low/medium/high pass through verbatim - xhigh/max clamp to high (Moonshot's max supported value) - minimal / unknown effort → omit reasoning_effort, keep thinking on - disabled / no config → unchanged - DeepSeek branch unchanged	2026-05-23 02:20:28 -07:00
Harish Kukreja	3589960e03	fix(provider): expose OpenCode Go reasoning controls	2026-05-23 02:20:28 -07:00
helix4u	71291d83cd	test: keep tirith checks hermetic	2026-05-23 02:20:14 -07:00
QuenVix	52a368fa72	fix(gateway): preserve WhatsApp pairing approvals across JID/LID alias flips	2026-05-23 01:46:34 -07:00
Teknium	3127a41cb1	test(acp): pin parse_model_input in slash-command tests The two ACP slash-command tests that exercise `provider:model` routing (`test_set_session_model_accepts_provider_prefixed_choice` and `test_model_switch_uses_requested_provider`) relied on the live `hermes_cli.models._KNOWN_PROVIDER_NAMES` / `_PROVIDER_ALIASES` module state to parse `anthropic:claude-sonnet-4-6` into `("anthropic", "claude-sonnet-4-6")`. If any earlier test in the same xdist worker registers a custom provider that shadows `anthropic` or otherwise mutates those globals, the parser falls into the `detect_provider_for_model` branch and resolves to `custom` instead. Observed once in CI on run 26326728502 / job 77505732299 as `AssertionError: assert 'custom' == 'anthropic'` — could not reproduce locally under per-file isolation, so the failing in-file order was specific to a particular xdist scheduling. Monkeypatching `parse_model_input` + `detect_provider_for_model` for both tests removes the global-catalog dependency, so the tests now only exercise what they were written to verify (the `requested_provider -> runtime -> AIAgent kwargs` plumbing).	2026-05-23 01:44:56 -07:00
xxxigm	6a2df9f451	docs(env): clarify HERMES_ENABLE_PROJECT_PLUGINS contract (#29156) The reference entry now documents the truthy set (``1`` / ``true`` / ``yes`` / ``on``) explicitly, matches the falsy half (``0`` / ``false`` / ``no`` / ``off`` / empty string) that the GHSA-5qr3-c538-wm9j fix re-aligned both the agent loader and the dashboard web server around, and points readers at the defence-in-depth rule that project plugins never have their Python ``api`` file auto-imported by the dashboard regardless of the env var.	2026-05-23 01:43:52 -07:00
xxxigm	8bf99227f0	fix(plugins): block plugin-api path traversal + project RCE (#29156 ) GHSA-5qr3-c538-wm9j — half two of the bypass chain. ``_mount_plugin_api_routes`` imports each dashboard plugin's manifest ``api`` field as a Python module via ``importlib.util.spec_from_file_location`` — arbitrary code execution by design. Two primitives in the surrounding code turned that "by design" RCE into a usable attack: 1. Absolute paths in the manifest swallow the plugin directory. ``Path('safe/dashboard') / '/tmp/evil.py'`` resolves to ``/tmp/evil.py``, so a single manifest line ``{"api": "/tmp/payload.py"}`` was enough to redirect the importer at any Python file on disk. 2. ``..`` traversal in the manifest climbs out of the dashboard directory. ``Path('plugins/safe/dashboard') / '../../../tmp/evil.py'`` lands in ``/tmp/evil.py`` after ``resolve()`` — the static-asset handler (``serve_plugin_asset``) already defends against this via ``is_relative_to``; the api-mount path didn't. Fix at three layers so a regression in any one can't re-open the advisory: * New ``_safe_plugin_api_relpath`` validator runs at discovery time and stores only sanitised relative paths on the plugin entry's ``_api_file`` field. Absolute paths, ``..`` traversal, empty / non-string values, and paths that ``resolve()`` outside the plugin's ``dashboard/`` directory are rejected with a warning naming the plugin. ``has_api`` follows the sanitised value so the dashboard frontend doesn't render a fake "Backend API" badge for plugins whose api was scrubbed. * ``_mount_plugin_api_routes`` re-validates the resolved path against the live filesystem just before the import — defence in depth in case ``_dir`` is tampered with post-cache or a future caller bypasses the discovery-time validator. * Project plugins (``source == "project"``) are refused outright for backend import. 
``./.hermes/plugins/`` ships with the CWD, so any threat model that includes "user opens a malicious repo" treats it as attacker-controlled; project plugins can still extend the UI via static JS/CSS but their Python ``api`` is no longer auto-imported. Combined with the truthy env-gate fix from the previous commit, the original advisory chain now fails at two distinct choke points.	2026-05-23 01:43:52 -07:00
xxxigm	da636e982b	test(plugins): regression coverage for project-plugin RCE chain (#29156 ) 35 new tests across 5 classes covering every layer of the GHSA-5qr3-c538-wm9j defence. Each class corresponds to one chokepoint so a regression in any single layer is caught by the named class: * ``TestProjectPluginsEnvGate`` (13 cases) — parametrised over both the documented truthy values (``1`` / ``true`` / ``yes`` / ``on`` + uppercase variants) and the previously-bypassing falsy strings (``0`` / ``false`` / ``no`` / ``off`` / ``""`` / ``False``). The falsy half is the direct env-bypass repro: pre-fix any non-empty string enabled the project source. * ``TestApiPathSanitizer`` (16 cases) — unit-level coverage of the new ``_safe_plugin_api_relpath`` helper. Absolute paths (``/etc/passwd``, ``/tmp/payload.py``, ``/usr/bin/python``), ``..``-traversal payloads (including nested ``subdir/../../..``), and non-string / empty / whitespace-only values must all return ``None``. Safe relative paths (``api.py``, ``backend/routes.py``) round-trip unchanged so legitimate plugins keep working. * ``TestDiscoveryScrubsApiField`` (3 cases) — end-to-end through ``_discover_dashboard_plugins`` with a real manifest on disk. Verifies that the cached plugin entry's ``_api_file`` is scrubbed at discovery time (``None`` + ``has_api: False``) so any downstream consumer can't be tricked into re-deriving the unsafe path from cache. * ``TestMountApiRoutesRefusesUntrusted`` (3 cases) — pokes synthetic plugin entries with each refusal vector directly into the cache and patches ``importlib.util.spec_from_file_location`` to assert it is not invoked for project-source / traversal payloads, and is invoked normally for bundled / user plugins. * ``TestEndToEndPocBlocked`` (1 case) — reproduces the original advisory PoC: operator sets ``HERMES_ENABLE_PROJECT_PLUGINS=0`` believing project plugins are off, attacker plants a manifest in CWD's ``.hermes/plugins/`` with ``api`` pointing at an absolute payload path. 
Asserts that the importer is never called against the payload path and that ``hermes_dashboard_plugin_evil`` is not in ``sys.modules`` after the mount routine runs. An autouse fixture busts ``_dashboard_plugins_cache`` before and after each test so the production cache (populated by the import-time ``_mount_plugin_api_routes()`` call) can't bleed in. All 12 pre-existing dashboard-plugin tests in ``test_web_server.py`` still pass unchanged.	2026-05-23 01:43:52 -07:00
xxxigm	09f85f2cf7	fix(plugins): apply truthy env semantics to project-plugin gate (#29156) GHSA-5qr3-c538-wm9j — half one of the bypass chain. ``_discover_dashboard_plugins`` opted into the untrusted ``./.hermes/plugins/`` source via ``if os.environ.get("HERMES_ENABLE_PROJECT_PLUGINS"):`` — which is True for any non-empty string. ``=0``, ``=false``, ``=no``, ``=off`` all return non-empty strings and so enabled the project source even though every operator (and the agent loader, ``hermes_cli/plugins.py`` line 815) reads those values as "disabled". An attacker who can land a manifest under the CWD's ``.hermes/plugins/`` directory — a malicious cloned repo, a worktree checked out from a forked PR, a CI runner workspace — was therefore guaranteed to get their manifest discovered the moment the user ran ``hermes dashboard`` from that directory, regardless of whether the user thought they had project plugins disabled. Switch to the shared ``utils.env_var_enabled`` helper used by the agent loader so the gate accepts the documented truthy set (``1`` / ``true`` / ``yes`` / ``on``, case-insensitive) and treats everything else — including ``0`` / ``false`` / ``no`` — as off. Half two (path-traversal + project-source ``api`` import) lands in the next commit. Together they break the RCE chain at two distinct choke points so a future regression in either one alone can't re-open the advisory.	2026-05-23 01:43:52 -07:00
Teknium	11e6dd3c60	chore(release): add AUTHOR_MAP entry for egilewski (PR #30432) (#30833)	2026-05-23 01:41:31 -07:00
Eugeniusz Gilewski	41d2c758c3	Fix unsafe gateway media path delivery	2026-05-23 01:40:35 -07:00
Markus	4a91e36495	fix(gateway): separate observed Telegram group context	2026-05-23 01:33:42 -07:00
Teknium	729a778af0	infographic: PR #17659 read-deny credentials salvage	2026-05-22 20:15:09 -07:00
Teknium	97e975edd2	fix(file-safety): widen read-deny to .env, mcp-tokens/, webhook secrets, root Extends @briandevans's PR #17659 from {auth.json, auth.lock, .anthropic_oauth.json} to also cover: - HERMES_HOME/.env (provider API keys) - HERMES_HOME/webhook_subscriptions.json (per-route HMAC secrets) - HERMES_HOME/mcp-tokens/ (OAuth token directory; dir + everything inside) …AND iterates over both _hermes_home_path() AND _hermes_root_path() so profile-mode runs (HERMES_HOME = <root>/profiles/<name>) also block <root>/{auth.json, .env, mcp-tokens/, ...}. Same widening shape as the write-deny side already does (#15981, #14157). Explicitly NOT a security boundary. Per the personal-assistant trust model, the terminal tool runs as the same OS user and can `cat auth.json` directly. This read-deny exists as defense-in-depth: - Models that respect tool denials empirically tend to stop rather than reach for the shell. - The denial surfaces an audit trail when something tries to read credentials — easier to spot in logs than a generic `cat`. Docstring + error message both flag this as defense-in-depth so future contributors don't mistake it for a real security boundary and don't re-decline reports that propose the same fix shape. Absorbs the .env and mcp-tokens/ coverage from @tomqiaozc's parallel PR #8055 (closed-as-duplicate, credited). Co-authored-by: Tom Qiao <zqiao@microsoft.com>	2026-05-22 20:15:09 -07:00
briandevans	567ea61298	fix(file-safety): block auth.json read via TERMINAL_CWD relative path read_file_tool resolves relative paths against TERMINAL_CWD (or the task's live terminal cwd), but the prior call passed the original unresolved string to get_read_block_error. That function's own resolve() is anchored at the Python process cwd, so when a task's TERMINAL_CWD pointed at HERMES_HOME and the agent issued read_file on the relative path "auth.json", the credential-store denylist was never reached and the file was read normally. Pass the already-resolved absolute path string at the file_tools call site, document the contract on get_read_block_error, and add a read_file_tool-level regression test that pins the relative-path case under TERMINAL_CWD == HERMES_HOME. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 20:15:09 -07:00
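The relative-path bypass above comes down to which directory a relative string is resolved against. The sketch below uses hypothetical helper names (``resolve_for_read``, ``is_denied``) and a made-up home directory; it is not the project's ``get_read_block_error``, only an illustration of why the caller has to pass the already-resolved path.

```python
from pathlib import Path

# Credential files named in the commit messages.
DENIED_NAMES = ("auth.json", "auth.lock", ".anthropic_oauth.json")


def resolve_for_read(path, terminal_cwd):
    """Resolve a possibly relative path against the task's terminal cwd."""
    p = Path(path)
    return str(p if p.is_absolute() else Path(terminal_cwd) / p)


def is_denied(path, hermes_home):
    """Deny-list check; a relative ``path`` resolves against the process cwd."""
    home = Path(hermes_home).resolve()
    target = Path(path).resolve()
    return any(target == home / name for name in DENIED_NAMES)


home = "/srv/hermes-home"  # hypothetical HERMES_HOME, also used as TERMINAL_CWD
# Fixed call shape: resolve first, then check.
blocked = is_denied(resolve_for_read("auth.json", home), home)
# Buggy call shape: the raw relative string is anchored at the process cwd.
missed = not is_denied("auth.json", home)
```

Unless the Python process itself happens to run from the home directory, ``blocked`` is true and ``missed`` is true as well, which matches the gap the regression test pins.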
briandevans	056e00a77e	fix(file-safety): block read_file on HERMES_HOME credential stores (#17656 ) `get_read_block_error` previously only denied reads inside `${HERMES_HOME}/skills/.hub`, which left `auth.json` (provider OAuth state + plaintext API keys) and `.anthropic_oauth.json` (Anthropic PKCE tokens) directly readable by the agent. A prompt-injection reaching `read_file` could exfiltrate active provider credentials in plaintext. Mode-0600 file permissions only protect against other Unix users — the agent runs as the file's owner, so `read_file` is unaffected. Extend the existing deny list with the three credential paths identified in #17656 (`auth.json`, `auth.lock`, `.anthropic_oauth.json`). The check uses the same `Path.resolve()` pattern as `skills/.hub`, so symlink/path-traversal indirection is caught too. The agent doesn't need to read these directly — `auxiliary_client` and `credential_pool` consume them through process env / OAuth flows that bypass `read_file`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 20:15:09 -07:00
Teknium	7f7245bf62	infographic: PR #6656 skill hub safety audit salvage	2026-05-22 19:59:24 -07:00
Teknium	3f78d8073c	fix(skills): make content_hash filename-sensitive too (symmetric with bundle_content_hash) PR #6656 added rel_path + \x00 prefixing to ``bundle_content_hash`` so a filename swap between two files in a bundle changes the digest. But it only patched the in-memory side — ``content_hash`` in ``tools/skills_guard.py`` (the on-disk equivalent) still hashed file contents only. These two functions need to stay symmetric: ``check_for_skill_updates`` compares the disk hash of an installed skill against the bundle hash of the upstream copy. With the asymmetric fix, every clean install showed as drifted because the digests no longer matched (2 existing tests in ``test_skills_hub.py`` started failing as soon as the contributor's change landed). Apply the same ``rel_path + \x00 + content`` shape to the disk-side function. Both functions now produce the same digest for the same skill content laid out two ways. Documented the symmetry invariant in the docstring so a future change to either function knows to touch both. Also adds tests/tools/test_pr_6656_regressions.py with 10 regression tests covering all three fixes salvaged in PR #6656: - uninstall_skill path traversal (4 cases: parent segments, absolute paths, symlink escape, legitimate skill) - bundle_content_hash filename swap detection (4 cases: in-memory swap, identity, disk-side swap, bundle↔disk symmetry) - list_pending lock contract (2 cases: source-grep contract, smoke) Also fixes AUTHOR_MAP entry for @aaronlab — their commit email (1115117931@qq.com) maps to "aaronagent" which isn't a real GitHub login, so changelog @mentions would 404.	2026-05-22 19:59:24 -07:00
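The ``rel_path + \x00 + content`` shape can be sketched as below. The function names, the choice of SHA-256, and the sorted iteration order are assumptions for illustration; the commit message does not say which digest or ordering ``bundle_content_hash`` and ``content_hash`` actually use.

```python
import hashlib


def path_aware_hash(files):
    """Digest over (relative path, NUL, content) for each file.

    ``files`` maps relative path -> bytes. Hypothetical sketch only.
    """
    h = hashlib.sha256()
    for rel_path in sorted(files):
        # The NUL separator keeps the path from blending into the content.
        h.update(rel_path.encode("utf-8") + b"\x00")
        h.update(files[rel_path])
    return h.hexdigest()


def contents_only_hash(files):
    """Older shape for comparison: file names never reach the digest."""
    h = hashlib.sha256()
    for rel_path in sorted(files):
        h.update(files[rel_path])
    return h.hexdigest()
```

A contents-only digest cannot see a rename (``{"a.md": x}`` and ``{"z.md": x}`` hash the same), while the path-aware digest changes for renames and for swapped file names alike. Feeding the in-memory bundle and the on-disk tree through the same function is what keeps the two sides symmetric.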
aaronagent	b82608a6f5	fix(skills,pairing): path traversal guard in uninstall, lock list_pending, hash file paths - skills_hub: validate that uninstall_skill's install_path resolves inside SKILLS_DIR before calling shutil.rmtree, preventing recursive deletion of arbitrary directories via poisoned lock.json entries - skills_hub: include file paths (not just contents) in bundle_content_hash so swapping filenames between files changes the hash, strengthening update-detection integrity - pairing: wrap list_pending() in self._lock so _cleanup_expired() file writes don't race with concurrent generate_code()/approve_code() calls Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-05-22 19:59:24 -07:00
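A containment check of the kind described for ``uninstall_skill`` might look like this. It is a sketch under assumptions: ``is_inside`` is a hypothetical name and the directory used in the examples is made up; the real guard may be shaped differently.

```python
from pathlib import Path


def is_inside(root, candidate):
    """True only if ``candidate`` resolves to a location strictly under ``root``.

    Resolving first collapses ``..`` segments and follows symlinks, so a
    poisoned entry cannot point a recursive delete at an outside directory.
    """
    root_resolved = Path(root).resolve()
    # Joining with an absolute candidate discards the root, which is what
    # lets the check below reject it.
    target = (root_resolved / candidate).resolve()
    return root_resolved in target.parents
```

A caller would run this before ``shutil.rmtree`` and refuse to delete when it returns ``False``; the root itself is also rejected, since a path is never among its own parents.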
teknium1	8cf977c8b1	fix(plugins): widen _sanitize_plugin_name for category-namespaced names Follow-up to PR #28832 — the dashboard plugin routes now accept slashed names like `observability/langfuse` and `image_gen/openai`, but `_sanitize_plugin_name` still rejected forward slash and so dashboard update + remove on those plugins fell through to '404 not found' even though they exist on disk. Adds an opt-in `allow_subdir=True` flag that: - Permits internal forward slashes (category-namespaced plugin keys emitted by `_discover_all_plugins`). - Strips leading and trailing slashes. - Still rejects `..` and backslash, and still asserts the resolved target lives inside `plugins_dir`. Opted in at the two read-paths that operate on installed plugins: `_require_installed_plugin` (CLI update/remove) and `_user_installed_plugin_dir` (dashboard update/remove). The install path keeps the default (`allow_subdir=False`) because freshly-cloned plugins always land top-level under `~/.hermes/plugins/<name>/`. Adds 6 targeted unit tests covering the new flag's allow/reject matrix.	2026-05-22 19:50:32 -07:00
Austin Pickett	487c398dcf	refactor(web): dashboard typography & contrast pass Removes the global `uppercase` + `font-mondwest` from the App.tsx root that forced every page to opt-out, replaces stacked-alpha text colors with semantic tokens for WCAG-AA contrast across all 7 themes, and applies the new `text-display` utility from @nous-research/ui@0.16.0 on intentional brand chrome (page titles, sidebar headings, segmented filters) only. Bumps every sub-12px arbitrary text size to text-xs. Also widens the dashboard plugin routes (/api/dashboard/agent-plugins/{name:path}/...) so category-namespaced plugins like observability/langfuse and image_gen/openai can be enabled/disabled from the dashboard — previously the FE encodeURIComponent-ed the slash and the backend {name} route rejected it. _validate_plugin_name still blocks .. and backslash, and strips leading/trailing slash. Touches sessions/env/keys page chrome and adds two new i18n keys (`overview`, `showMore`/`showLess`) across all 18 locales. Squashes 19 commits from PR #28832. Co-authored-by: Hermes <noreply@nousresearch.com>	2026-05-22 19:50:32 -07:00
ethernet	dc4b0465b5	feat(ci): use 6-way slicing based on benchmark results Benchmarked 4/5/6/7/8 slices with LPT duration-balanced distribution: - 4 slices: 4.8m wall, 135s spread - 5 slices: 3.4m wall, 46s spread - 6 slices: 3.3m wall, 26s spread ← optimal - 7 slices: 3.9m wall, 109s spread - 8 slices: 3.7m wall, 96s spread 6 slices is the sweet spot: lowest wall time, tightest spread. 7+ gets slower due to per-slice startup overhead dominating. Also removes benchmark branch markers from save-durations condition.	2026-05-22 19:46:18 -07:00
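The LPT (longest processing time first) distribution mentioned above is a greedy scheme: sort tests by recorded duration, longest first, and always hand the next test to the currently lightest slice. The sketch below is a generic version of that technique; the function name and input shape are assumptions, not the repository's CI script.

```python
import heapq


def lpt_slices(durations, n_slices):
    """Assign tests to slices with greedy LPT balancing.

    ``durations`` maps test name -> seconds. Returns a list of name lists.
    """
    # Min-heap of (current load, slice index): the lightest slice pops first.
    heap = [(0.0, i) for i in range(n_slices)]
    slices = [[] for _ in range(n_slices)]
    # Longest tests first; ties broken by name for a deterministic layout.
    ordered = sorted(durations.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, secs in ordered:
        load, idx = heapq.heappop(heap)
        slices[idx].append(name)
        heapq.heappush(heap, (load + secs, idx))
    return slices
```

Balancing only evens out test time; the fixed startup cost of each slice is not modelled here, which fits the observation that 7 and 8 slices were slower than 6.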

9270 commits