Commit graph

5459 commits

Author SHA1 Message Date
xxxigm
8f4a718f95 test(discord): guard slash-command registration against the 100 cap
Registers 200 plugin commands on top of the native + COMMAND_REGISTRY set
and asserts the tree never exceeds Discord's 100-command limit, that native
high-priority commands survive the cap, and that overflow is actually
dropped. Regression guard for the recurring error 30032
("Maximum number of application commands reached") sync failures.
2026-06-14 17:02:21 +07:00
Teknium
afc8615509
perf(webhook): prune request caches incrementally (#46065) 2026-06-14 02:40:54 -07:00
LeonSGP43
89bdb1e546 fix: read dashboard spa assets as utf-8
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-06-14 02:31:04 -07:00
Teknium
7b9dc7cd0a test(gateway): align web profile wrapper expectation 2026-06-14 02:20:55 -07:00
helix4u
d76a58bd15 fix(gateway): resolve sudo profile system installs 2026-06-14 02:20:55 -07:00
Teknium
1f5eef8093 test(tui): tolerate resume init kwargs in protocol tests 2026-06-14 02:15:33 -07:00
Teknium
9f33d673e9 fix(tui): persist resumed profile cwd updates to profile db 2026-06-14 02:15:33 -07:00
dsad
d842155da1 Keep resumed profile cwd scoped to profile DB 2026-06-14 02:15:33 -07:00
helix4u
4936a49a0c fix(mcp): preserve loop during probes
Some checks failed
Deploy Site / deploy-vercel (push) Waiting to run
Deploy Site / deploy-docs (push) Waiting to run
Docker Build and Publish / build-amd64 (push) Waiting to run
Docker Build and Publish / build-arm64 (push) Waiting to run
Docker Build and Publish / merge (push) Blocked by required conditions
Lint (ruff + ty) / ruff + ty diff (push) Waiting to run
Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run
Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run
Nix / nix (macos-latest) (push) Waiting to run
Nix / nix (ubuntu-latest) (push) Waiting to run
OSV-Scanner / Scan lockfiles (push) Waiting to run
Tests / test (1) (push) Waiting to run
Tests / test (2) (push) Waiting to run
Tests / test (3) (push) Waiting to run
Tests / test (4) (push) Waiting to run
Tests / test (5) (push) Waiting to run
Tests / test (6) (push) Waiting to run
Tests / save-durations (push) Blocked by required conditions
Tests / e2e (push) Waiting to run
Typecheck / typecheck (apps/bootstrap-installer) (push) Waiting to run
Typecheck / typecheck (apps/desktop) (push) Waiting to run
Typecheck / typecheck (apps/shared) (push) Waiting to run
Typecheck / typecheck (ui-tui) (push) Waiting to run
Typecheck / typecheck (web) (push) Waiting to run
uv.lock check / uv lock --check (push) Waiting to run
Nix Lockfile Fix / auto-fix-main (push) Has been cancelled
Nix Lockfile Fix / fix (push) Has been cancelled
Build Skills Index / build-index (push) Has been cancelled
Build Skills Index / trigger-deploy (push) Has been cancelled
2026-06-14 02:09:45 -07:00
helix4u
85e6232a07 fix(providers): support anthropic proxy v1 endpoints 2026-06-14 02:09:16 -07:00
Teknium
81e42335a1
fix(file-safety): relax user-write deny policy (#45947)
Allow file tools to edit shell startup files, user package-manager configs, and Hermes control files that the user can already modify directly. Keep hard blocks for SSH keys, .env/OAuth token stores, mcp-tokens, pairing files, and system privilege files.
2026-06-14 02:07:32 -07:00
Brooklyn Nicholson
715b691723 fix(desktop): show summarizing indicator during auto-compaction
Auto-compression rewrites history mid-turn, which made long threads look
like they reset. Re-tag the gateway lifecycle status as compacting and
surface it in the desktop thread loading indicators.
2026-06-14 02:28:07 -05:00
kshitijk4poor
12c84d6c77 fix(transports): only treat a refusal as terminal when it is the sole payload
A chat-completions response that carries real text or tool calls *alongside*
a `message.refusal` note is a normal, usable turn — the model did work. The
prior logic flipped finish_reason to `content_filter` whenever a refusal
string was present, so the conversation loop reframed a content-bearing turn
as a *failed* safety refusal (failed=True) and buried the model's actual
output inside the "model declined" template, or dropped tool calls entirely.

Only promote to a terminal `content_filter` when the refusal is the sole
payload (no visible text AND no tool calls). The refusal explanation is still
recorded in provider_data in every case for observability. Refusal-only
responses (the bug this feature targets) are unaffected and still surface
terminally; the empty+refusal, bare content_filter passthrough, and no-refusal
common cases are byte-identical to before.

Updates the partial-content test to the corrected contract and adds a
tool_calls-alongside-refusal regression guard.
2026-06-14 12:12:52 +05:30
SHL0MS
ab26541b9a test(transports): lock in content_filter passthrough for OpenRouter
OpenRouter (and every other OpenAI-compatible provider) uses the default
chat_completions transport, so it is already covered by the refusal fix:
an upstream Claude / moderation refusal arrives as
finish_reason="content_filter" (often empty content, no message.refusal).
Add a regression test asserting the transport passes that finish reason
straight through to the loop's content_filter handler.

(cherry picked from commit 60168a513b)
2026-06-14 12:10:08 +05:30
SHL0MS
bb46bf8ce4 fix(agent): surface model refusals instead of retrying them as errors
A Claude refusal (HTTP 200, stop_reason="refusal", empty content) was
laundered into a generic retry loop and surfaced as a misleading
"rate limited / invalid response" or "no content after retries" error,
burning paid attempts reproducing a deterministic refusal.

This hit two distinct paths:

- Direct Anthropic (anthropic_messages): validate_response rejected the
  empty-content refusal *before* normalize_response mapped refusal ->
  content_filter, so it fell into the invalid-response retry loop.
- Nous Portal / OpenAI-compatible (chat_completions): the portal surfaces
  a Claude refusal via message.refusal with empty content, which sailed
  past validation and died in the empty-response retry loop.

Fix (one unified content_filter dispatch for all backends):
- AnthropicTransport.validate_response: accept empty content when
  stop_reason == "refusal" so it flows to normalize_response.
- ChatCompletionsTransport.normalize_response: promote message.refusal to
  content + a content_filter finish reason.
- conversation_loop: handle finish_reason == "content_filter" - fire the
  api_request_error hook (content_policy_blocked), try a configured
  fallback once, else return a clear terminal refusal message. Never retry
  a deterministic refusal.

Supersedes #43084, which fixed only the direct-Anthropic path and could
not reach the chat_completions/portal path.

Tests: transport-level (validate_response refusal, message.refusal
promotion) + end-to-end loop (refusal surfaced, exactly one API call).

(cherry picked from commit 01f546f92c)
2026-06-14 12:10:08 +05:30
brooklyn!
4b5ba112ad
fix: shrink images to reported provider dimension limit (#45979)
Parse provider-reported image pixel ceilings so many-image Anthropic requests can recover by shrinking Retina screenshots below the stricter limit instead of retrying the same rejected payload.
2026-06-14 01:07:43 -05:00
Teknium
8f278403d1
perf(execute-code): stop waiting on idle RPC accept (#45948) 2026-06-13 21:57:15 -07:00
Teknium
1b16c48170 fix: guard OAuth account removal 2026-06-13 21:47:13 -07:00
Justin Sunseri
12682d96b9 feat(telegram): restore rich messages opt-out
Salvages PR #45840's client-compatibility opt-out while keeping rich messages enabled by default via telegram.extra.rich_messages: true.
2026-06-13 21:45:49 -07:00
aimable100
8d5d36d793
fix(dispatch): forward session_id into registry.dispatch (#28479)
Both the regular and execute_code dispatch paths forward task_id into
registry.dispatch via middleware _dispatch lambdas but silently dropped
session_id. Dispatch-layer hooks (e.g. set_enforcement_fn) that correlate
calls with the active session received "" for every invocation.

Pass session_id=session_id at both _dispatch call sites inside
handle_function_call, matching the existing task_id pattern. Hooks
already received session_id; this closes the registry.dispatch gap.

Rebased onto current main where dispatch is wrapped by
run_tool_execution_middleware — the old direct-dispatch sites from
#28479 no longer exist.

test(dispatch): add tests for session_id forwarding (NousResearch#28479)

Covers standard and execute_code paths through the middleware wrapper.
Verifies task_id forwarding is not broken by the change.
2026-06-14 00:27:59 -04:00
Teknium
7aaae7acd0 fix(ssl): align guard docs and escape hatch 2026-06-13 21:14:32 -07:00
Teknium
af5b526472 fix(ssl): validate CA bundle paths before provider calls 2026-06-13 21:14:32 -07:00
chromalinx
b42c5bf652 test(ssl_guard): fix macOS fallback test that passed for the wrong reason
The previous test patched ssl.create_default_context globally with a bare
SSLContext that has zero CA certs. Both verify_ca_bundle() and the macOS
fallback got the same mocked context, so the test verified nothing useful:
both paths produced empty get_ca_certs() and the assertion that no
exception escaped was vacuously satisfied.

Only mock the fallback call (no cafile) — let the certifi call hit the
real SSL stack and fail with SSLError on the broken PEM. The mock
fallback returns a context with load_default_certs() so the test now
verifies the real scenario: broken certifi → SSLConfigurationError,
macOS system trust store → success.

Also pads the broken PEM past the 1 KB size guard so the size check
doesn't short-circuit before ssl.create_default_context(cafile=...) runs.

Reported by @liuhao1024 in PR review.
2026-06-13 21:14:32 -07:00
chromalinx
a218a0f156 fix(agent,gateway,doctor): add SSL CA cert bundle fail-fast guard
A stale certifi CA bundle after a partial `hermes update` used to crash
the agent on the first outbound HTTPS call with a raw traceback and
trap the gateway in a retry loop.

This patch:

* Adds `agent/errors.py` with a typed `SSLConfigurationError`
* Adds `agent/ssl_guard.py` with a `verify_ca_bundle()` pre-flight
  that asserts the bundle exists, is non-trivial in size, and can build
  a working SSLContext. On macOS, it falls back to the system trust
  store when the bundle is empty but the system store is healthy
  (covers corporate proxies / MDM setups).
* Wires the guard into `run_agent.py` and `gateway/run.py` right
  after the `hermes_bootstrap` import, inside a try/except so a bug
  in the guard itself can never prevent startup.
* Adds a `SSL / CA Certificates` section to `hermes_cli doctor` so
  users can detect the failure with one command.
* Adds unit tests covering the healthy, missing, empty, skip-env, and
  macOS-fallback paths.
* Adds an RCA document describing the failure mode and the recovery
  path (`pip install -e .`).

When the bundle is broken the user sees:

    \u26a0\ufe0f SSL certificate bundle issue detected.
       Run: pip install -e .

`HERMES_SKIP_SSL_GUARD=1` disables the check for sandboxed
environments that ship their own trust store.
2026-06-13 21:14:32 -07:00
Teknium
1106879147
perf(process): wake waiters on background completion (#45831) 2026-06-13 21:11:19 -07:00
Max Pollard
9a2b976326 test(skills): add regression tests for bundled-update backup recovery
Three tests covering: a stale .bak poisoning a failed update's move/restore, an orphaned .bak misread as a user deletion, and a partially written dest blocking restore-on-failure. All three fail on current main without the fix.

Refs #44942
2026-06-13 15:01:42 -07:00
Teknium
bf8effad02
fix(utils): copy fallback for atomic replace across devices (#43852)
Fallback from `os.replace` on EXDEV/EBUSY using copy+fsync+unlink while preserving symlink target semantics and metadata.
2026-06-13 14:50:05 -07:00
Teknium
817f392311
feat(read): extract notebook and office documents (#37082)
Add stdlib-only extraction for `.ipynb`, `.docx`, and `.xlsx` in read_file with lazy integration and malformed-document fallback.
2026-06-13 14:42:51 -07:00
Teknium
2b67e96aec fix(approval): gate in-place edits to sensitive user files
Cover sed, perl, and ruby in-place mutations against shell rc, SSH, and credential files so terminal approvals pair the redirection and copy guards.
2026-06-13 14:35:27 -07:00
helix4u
abd69b8117 fix(approval): detect absolute home shell rc writes 2026-06-13 14:35:27 -07:00
briandevans
da28d5d113 fix(security): gate cp/mv/install into ~/.ssh, credential, and shell-rc files
tools/approval.py already denies tee/redirection writes to every
_SENSITIVE_WRITE_TARGET (~/.ssh/*, ~/.netrc/.pgpass/.npmrc/.pypirc, shell
rc files, ~/.hermes/config.yaml/.env) via the DANGEROUS_PATTERNS tee/`>`
rules, but cp/mv/install were only paired for _SYSTEM_CONFIG_PATH (/etc) and
the project-relative env/config target. So `cp evil ~/.ssh/authorized_keys`
(SSH-key implant / persistence), `cp creds ~/.netrc`, and `cp evil ~/.bashrc`
(login-time command injection) auto-approved while the equivalent tee/`>`
forms were denied — an unpaired write deny is theater (same rationale as
#14639 / commit 4e9d886d, which paired the terminal side for
~/.hermes/config.yaml writes but did not touch these cp/mv/install verbs on
the broader sensitive set).

Add one (cp|mv|install) DANGEROUS_PATTERNS entry reusing the existing
_SENSITIVE_WRITE_TARGET fragment, anchored via _COMMAND_TAIL so it fires on
the destination (last arg) only: reading OUT of a sensitive path
(`cp ~/.ssh/config /tmp/x`) stays auto-approved. Description differs from the
system-config cp entry so the two keep distinct approval keys (no silent
cross-approval). Additive — does not subsume the /etc or project-config rules.

Adds TestSensitiveCopyMovePattern: 5 positive cases (ssh authorized_keys,
ssh private key via mv, netrc via install, bashrc, ~/.hermes/config.yaml) +
2 negative guards (copy FROM ssh, unrelated copy). The ssh/netrc/bashrc
positives fail on main and pass on this branch; the negatives stay green
both ways.
2026-06-13 14:35:27 -07:00
Teknium
1fa761f8de
fix(search): keep partial results on search timeout (#36142)
Treat search command budget timeouts as soft truncation so partial results survive, while real search failures still return structured errors.
2026-06-13 14:35:21 -07:00
briandevans
1d584a301e fix(agent): treat Codex reasoning items as thinking-only 2026-06-13 14:35:00 -07:00
ITheEqualizer
57c2a55be4 fix(telegram): harden rich message fallback handling
Carry forward focused follow-ups from PR #45741: treat PTB's raw Bot API 10.1 response shapes safely, recognize real missing-endpoint errors, preserve link preview settings on rich sends, and lock the rich limit to Telegram's character-based cap.
2026-06-13 14:34:53 -07:00
Teknium
c8e5f34f24
fix(gemini): strip native self prefixes before generateContent (#36141)
Strip `google/` and `gemini/` self-prefixes before native Gemini generateContent calls, and keep provider-normalization expectations aligned.
2026-06-13 13:47:08 -07:00
briandevans
7d11fa4e9e fix(codex-responses): let final_answer complete top-level incomplete responses 2026-06-13 13:45:29 -07:00
ITheEqualizer
7c0605bf22 fix(telegram): preserve rich formatting on stream final 2026-06-13 13:44:45 -07:00
achaljhawar
819def44c7 fix(agent): scope Nous tags to Nous auxiliary calls 2026-06-13 13:24:40 -07:00
Teknium
08890d77e6
fix(plugins): normalize browser-pasted GitHub repo URLs (#33539)
Accept common GitHub web URLs in `hermes plugins install` by normalizing repository views back to cloneable `.git` URLs, with focused parser coverage.
2026-06-13 13:23:59 -07:00
kshitijk4poor
63097ee0d7 test(gateway): cover auto-resume full-path no-regression; clarify guard docstring
The salvaged fix's two regression tests mock adapter.handle_message, so
they only assert the pre-claimed sentinel is set/cleaned around a stub —
they never drive the real dispatch chain. Add a full-path test that
exercises _schedule_resume_pending_sessions -> _guarded_handle_message ->
adapter.handle_message -> _process_message_background -> _handle_message
and asserts the resumed session's agent runs EXACTLY ONCE: not zero (the
pre-claim must not self-bounce the resume into a queued no-op) and not
twice (the duplicate-agent bug #45456 the fix targets). Also assert no
leaked sentinel and no orphaned pending event after the drain settles.

Tighten the _guarded_handle_message docstring: on current main the real
sentinel is taken over inside _handle_message (not _process_message_background),
and note the `is _AGENT_PENDING_SENTINEL` guard only releases the slot we
ourselves placed, never one a live run owns.
2026-06-13 23:39:35 +05:30
liuhao1024
6e2fd955ca fix(gateway): claim session slot before auto-resume task to prevent duplicate agents
When the gateway restarts and auto-resumes an interrupted session, an
inbound message arriving in the window between `asyncio.create_task()`
and the task's first await could spin up a second AIAgent for the same
session.  Both agents would then process messages concurrently,
producing interleaved duplicate responses (#45456).

Fix: set `_AGENT_PENDING_SENTINEL` in `_running_agents` immediately
after the "already running" check, before creating the task.  This
closes the race window — any inbound message sees the slot as occupied
and queues behind the auto-resume.

A `_guarded_handle_message` wrapper ensures the pre-claimed sentinel is
always released, even if `handle_message` raises before reaching
`_process_message_background` (whose `finally` block handles normal
cleanup).

(cherry picked from commit 85150c976b)
2026-06-13 23:36:51 +05:30
helix4u
78c11d99e3 fix(update): stop Windows gateways before mutating install 2026-06-13 10:46:08 -07:00
ashishpatel26
957a8ffa88 fix(bedrock): omit sampling params for restricted Claude models
Bedrock Converse rejects non-default sampling parameters for Opus 4.7 and 4.8 with a ValidationException. Reuse the Anthropic-native sampling-param guard in the Bedrock kwargs builder so those models omit temperature/topP while older Claude and non-Claude models keep existing behavior.

Includes the stop-sequence regression from the parallel fix to ensure stopSequences still pass through for restricted Opus models.

Co-authored-by: Tranquil-Flow <tranquil_flow@protonmail.com>
2026-06-13 10:45:56 -07:00
WompaJango
28bf8fb47d feat(dashboard): clone profiles from any source 2026-06-13 07:33:58 -07:00
Que0x
3380563d94 fix(security): stop /api/status leaking host paths and PID on gated binds
The dashboard's public /api/status liveness endpoint is in PUBLIC_API_PATHS
and bypasses dashboard auth, yet it returned absolute hermes_home,
config_path, env_path, the gateway PID, and the internal gateway health URL.
That exceeds the shape its own allowlist documents as public ("version,
gateway state, active session count, and the dashboard auth-gate shape. No
bodies, no session content, no secrets"), leaking deployment recon to any
unauthenticated caller on a network-exposed (gated) bind.

Withhold host-local detail unless the bind is loopback / --insecure, where
the dashboard is local-only and the caller is already inside the trust
envelope -- the same split should_require_auth draws. The NAS liveness probe
and the auth-gate badge are unaffected.

Adds invariant tests for both modes (gated withholds, loopback keeps).
2026-06-13 07:18:59 -07:00
Teknium
ad7436a5d9 fix(gateway): preserve WeCom per-group sender allowlists
Keep the own-policy fail-closed hardening from PR #45444, but still trust WeCom groups.<id>.allow_from because the adapter already checked that sender allowlist before dispatching to gateway auth.
2026-06-13 07:18:54 -07:00
Que0x
fc46354580 fix(security): fail closed when an own-policy gateway adapter has no allowlist
Own-policy adapters (WhatsApp, WeCom, Weixin, QQBot, Yuanbao) default dm_policy/group_policy to "open", which forwards every sender. The gateway's adapter-trust shortcut in _is_user_authorized blanket-trusted those platforms when no env allowlist was set, so an operator who enabled one with only credentials authorized the entire external network -- the fail-open SECURITY.md section 2.6 forbids ("an allowlist is required for every enabled network-exposed adapter").

Trust the adapter only when its effective policy for the chat type is an actual "allowlist" restriction (the case #34515 was protecting). "open"/"pairing"/anything else falls through to default-deny, where {PLATFORM}_ALLOW_ALL_USERS / GATEWAY_ALLOW_ALL_USERS and the pairing flow remain the explicit opt-ins.
2026-06-13 07:18:54 -07:00
Teknium
1185dfd773 test: cover legacy Office document extensions 2026-06-13 07:18:37 -07:00
Tranquil-Flow
4fd9397ae3 fix(codex): drop extra_headers for chatgpt.com backend 2026-06-13 07:13:24 -07:00
Sarvesh
45f9099e51 fix(matrix): preserve markdown table structure 2026-06-13 06:57:08 -07:00