hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-01 12:02:05 +00:00

Author	SHA1	Message	Date
teknium1	c23f394eb8	fix: satisfy ruff encoding + windows-footgun lints for cgroup reaper - read_text(encoding='utf-8') (PLW1514) - # windows-footgun: ok on signal.SIGKILL — module is Linux-only (reads /proc, /sys/fs/cgroup; runs from a systemd unit) - test lambda accepts the new encoding kwarg	2026-06-28 02:05:50 -07:00
teknium1	86ec979f66	chore(release): map PRATHAMESH75 author for #37550 salvage	2026-06-28 02:05:50 -07:00
PRATHAMESH75	e551da6ddb	fix(gateway): reap cgroup orphans via ExecStopPost to unblock restart Long-lived helpers spawned indirectly by tool calls (adb, platform bridges) were left in the service cgroup after the gateway's main process exited. When the kernel rejected the deferred cgroup-wide kill with EINVAL, systemd blocked Restart=always for 6+ minutes, taking down all platforms and cron windows (#37454). Add a small ExecStopPost helper (gateway.cgroup_cleanup) that walks cgroup.procs and sends per-PID SIGKILLs — a different kernel code path than cgroup.kill, so it succeeds where the cgroup-wide write failed. KillMode=mixed is preserved so the gateway still reaps its own tool-call children before systemd intervenes (#8202). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-28 02:05:50 -07:00
Coy Geek	d7a1052424	fix(env-passthrough): fail closed when provider blocklist import fails When tools.environments.local can't be imported (partial install, import-time error), _is_hermes_provider_credential() returned False — fail-open. A skill could then register a Hermes provider credential (ANTHROPIC_API_KEY, etc.) as env passthrough; _scrub_child_env lets passthrough vars bypass the secret-substring net (rule 1), so the operator's real key would land in the execute_code child. Reopens the GHSA-rhgp-j443-p4rf bypass. Fail closed instead: on import failure, treat the name as a protected provider credential and refuse passthrough. Regression test exercises the full register -> scrub path under a simulated import failure. Co-authored-by: Hermes Agent <noreply@nousresearch.com>	2026-06-28 02:05:43 -07:00
teknium1	58c36b1798	fix(api-server): widen error redaction to cron-endpoint + SSE sites Follow-up to the salvaged #37733 fix. The contributor centralized redaction at _openai_error and the chat/responses failure paths, which covers the OpenAI-compatible envelopes transitively. Two sibling classes crossed the same authenticated HTTP boundary unredacted: - 8x cron-management endpoints returning {"error": str(e)} on 500 - the session-chat SSE error event ({"message": str(exc)}) Route both through the same _redact_api_error_text(force=True) helper. Add AUTHOR_MAP entry for coygeek and a TestRedactApiErrorText guard covering mask/force/limit/passthrough behavior.	2026-06-28 02:05:38 -07:00
Coy Geek	5e774de76e	fix(api-server): redact provider errors at HTTP boundary Force API-server error text through the existing secret redactor before returning OpenAI-compatible errors, response fallback text, response snapshots, and run failure events. This prevents credential-shaped provider failure text from crossing the API-server boundary while preserving debuggable sanitized messages.	2026-06-28 02:05:38 -07:00
HexLab98	d2fda5925d	test(gateway): cover Discord/Slack compression status suppression (#39293)	2026-06-28 14:35:32 +05:30
HexLab98	d2ea948bc0	fix(gateway): suppress compression status noise on Discord and other chats (#39293) Extend the gateway noisy-status filter beyond Telegram so internal compression lifecycle messages stay in logs instead of spamming Discord, Slack, and other messaging channels.	2026-06-28 14:35:32 +05:30
teknium1	9f7d520caf	docs: add infographic for #36664 WhatsApp LID session-path fix	2026-06-28 02:05:26 -07:00
teknium1	3aaa98dd01	test(whatsapp): cover LID allowlist match on modern session layout Add an _is_user_authorized E2E for the platforms/whatsapp/session layout on top of fesalfayed's resolver fix (#36665) — guards the actual silently-dropped-LID-sender path from #36664.	2026-06-28 02:05:26 -07:00
fesalfayed	263ffec1b0	fix(whatsapp): resolve LID aliases on modern platforms/ session layout expand_whatsapp_aliases hardcoded get_hermes_home()/whatsapp/session, but the adapter writes lid-mapping files via get_hermes_dir("platforms/whatsapp/session", "whatsapp/session"). On installs without the legacy directory the two paths diverge, so the resolver finds no mappings and returns the bare LID, which misses the allowlist and silently drops the message. Resolve through the same helper so both sides stay in lockstep on new and legacy layouts.	2026-06-28 02:05:26 -07:00
teknium1	d0f087e7f9	docs: add infographic for #36109 empty-400 diagnostics	2026-06-28 02:05:20 -07:00
xxxigm	093f567f0d	fix(agent,cli): surface empty-body API errors and fail oneshot exit code When an LLM API call returns HTTP 4xx with an empty parsed SDK `body` ({}), `_summarize_api_error` fell through to a bare `str(error)`, so users saw only "HTTP 400" with no provider detail (reported on Windows in #36109). The SDK leaves `body` empty in this case, but the httpx `response` still carries the payload in `.text`. - run_agent.py `_summarize_api_error`: when `body` is empty, fall back to `response.text` — parse a JSON `error.message`/`message` when present, else surface the raw (truncated) body. Platform-agnostic diagnostics. - hermes_cli/oneshot.py: `hermes -z` now runs via `run_conversation` and returns exit code 2 when the run is failed/partial with no usable final response, so scripts can detect LLM failures (still 0 when a response — incl. an error summary as output — is produced). Tests: new tests/run_agent/test_summarize_api_error.py (empty-body JSON + raw text, RED/GREEN verified) + oneshot exit-code/`run_conversation` wiring tests. NOTE: #36109's original root cause (Windows "all providers return empty 400") is not reproducible on current main (heavy provider-transport churn since v0.15.1). This change does not claim to fix that root cause — it makes any empty-body API error LEGIBLE so a future occurrence shows the real provider message instead of a bare HTTP 400. Relates to #36109 (does not close it).	2026-06-28 02:05:20 -07:00
teknium1	c0b4a3438a	fix(install): scope Playwright override to too-new apt releases + keep step interruptible Follow-up on #54032 for #35166: - Gate the PLAYWRIGHT_HOST_PLATFORM_OVERRIDE retry on the host being an apt release newer than Playwright recognizes (Ubuntu >24.04 / Debian >13) via playwright_host_unrecognized(), instead of retrying on ANY install failure. A network/disk/permission failure on a supported host now surfaces unchanged rather than getting a mismatched-glibc build forced onto it. - detect_os() now captures DISTRO_VERSION from os-release. - Fold in the interruptibility fix (was PR #35304, self-closed): wrap the download in 'timeout --foreground -k 10' (probed, with plain-timeout fallback) so a terminal Ctrl+C reaches the child and a wedged download is force-killed after the deadline. - Add behavioral tests that source the helpers and assert the retry fires only on Ubuntu 26.04 / Debian 14, not on supported hosts, non-apt distros, native-success, operator-pinned override, or unsupported arch.	2026-06-28 02:05:18 -07:00
kshitijk4poor	a28fe788a6	fix(install): retry Playwright install with platform override on unrecognized host (#35166 ) On apt releases newer than the bundled Playwright recognizes (Ubuntu 26.04, Debian 14, and future distros), 'npx playwright install --with-deps chromium' hangs uninterruptibly at 'Installing Playwright Chromium with system dependencies' because Playwright's resolver maps the host to a platform with no download build (#35166). Wrap every installer Playwright call in run_playwright_install(), which tries the native install first and, only if it fails or times out, retries once with PLAYWRIGHT_HOST_PLATFORM_OVERRIDE pinned to the newest known build (ubuntu24.04-<arch>). This is the escape hatch Playwright's maintainers bless for unrecognized platforms (microsoft/playwright#33434). Try-native-first (not a hardcoded distro/version table) is deliberate: - Self-correcting — when Playwright already supports the host (e.g. Ubuntu 26.04 on Playwright >=1.61) the first attempt succeeds and the override is never applied, so we never force a mismatched-glibc build onto a release Playwright handles correctly (microsoft/playwright#35114). - Zero-maintenance — new distro releases work the moment Playwright adds them. - Covers Debian 14+ and future releases, not just Ubuntu 26.04. An operator-set PLAYWRIGHT_HOST_PLATFORM_OVERRIDE is always respected (applied to the first attempt; retry skipped). Non-x64/arm64 arches have no fallback build and skip the retry. Refs #35166	2026-06-28 02:05:18 -07:00
teknium1	64972b6403	fix(config): canonicalize model.name/model.model to model.default (#34500 ) A custom_providers config that names the model under model.name (or model.model) resolved to an empty model, so the API request went out with model= — HTTP 400 from OpenAI-compatible backends. Display paths (hermes status/dump) already read model.name and showed the model, making the failure silent. The model id was read via 'default or model' at ~14 independent sites (cli, gateway, cron, curator, oneshot, fallback, profiles, ...), none of which honored 'name'. Rather than patch every site, canonicalize at the single load/save chokepoint: _normalize_root_model_keys() now promotes model.model/model.name -> model.default (precedence default > model > name) and drops the stale alias, so every reader — present and future — sees a populated default and config.yaml is migrated canonical on next save. The gateway, which bypasses load_config(), replays the same normalization in _load_gateway_config(). Co-authored-by: Bartok9 <danielrpike9@gmail.com> Credit: root-cause analysis and fix direction from @Bartok9 (#34502, first) and @v86861062 (#34527).	2026-06-28 02:05:13 -07:00
Teknium	2ecb6f7fe6	fix(telegram): clear send_path_degraded on successful reconnect (#35205) (#54076) * fix(telegram): clear send_path_degraded on successful reconnect _send_path_degraded was cleared only in _verify_polling_after_reconnect, 60s after reconnect and only if scheduled. A clean start_polling() reconnect left the flag stuck True, short-circuiting send() and blocking all outbound messages until the deferred probe ran (or forever if it never did). Clear the flag the moment start_polling() succeeds; that is the recovery signal. The deferred probe remains a defensive re-check that re-enters the reconnect ladder (re-setting the flag) if it detects a silent wedge. Fixes #35205. * docs: add infographic for #35205 telegram send-path fix	2026-06-28 01:38:17 -07:00
Teknium	674e16e7c6	fix(redact): stop DB-connstr redaction from corrupting code output (#33801 ) (#54061 ) Secret redaction is display/output-scoped on main — write_file writes content verbatim, terminal/execute_code redact only output not the command/source. The real bug is in displayed tool OUTPUT (read_file, terminal, execute_code): _DB_CONNSTR_RE's password group [^@]+ was greedy across newlines, so on a multi-line block it scanned past the DSN line to the next stray '@' (a Python @decorator), replacing every intervening character — including line breaks — with *. That dropped lines and concatenated the next line onto the f-string line, making read_file output look corrupted (the file on disk was always correct). Reported in #33801. Fix: - Forbid whitespace in the userinfo/password groups ([^:\s]+ / [^@\s]+) so the match can never span a line break. A real DSN password never contains whitespace. This alone kills the catastrophic line-dropping. - Under code_file=True, preserve a password group that is a pure {...} brace expression — f"postgresql://{user}:{pass}@{host}" is an f-string template, not a live credential. Literal passwords are still masked. - Pass code_file=True at the terminal and execute_code output redaction call sites (file_tools already did) so code-execution output isn't corrupted by ENV/JSON/template false positives. Real prefixes, auth headers, JWTs, and private keys are still redacted. Verified E2E against the reporter's exact pydantic-settings module: file written verbatim, read_file shows the DSN f-string + @model_validator intact with zero * corruption, while a literal postgresql://admin:pw@host DSN and a real sk- key are still masked. Reported-by: koishi70 Reported-by: pfrenssen	2026-06-28 01:15:39 -07:00
Teknium	de6e9ac760	docs(discord): document bot-to-bot comms as unsupported (#32791) (#54063) * docs(discord): document bot-to-bot comms as unsupported (#32791) Multi-profile bot-to-bot conversation is not a supported topology. DISCORD_ALLOW_BOTS=none (the default) blocks all bot-originated messages; setting mentions/all across multiple Hermes profiles to make them reply to each other ack-loops because Discord's reply auto-mention satisfies the mention gate every turn. Document the safe default and the loop hazard so operators don't wire it up. * docs(discord): infographic for bot-to-bot unsupported stance (#32791)	2026-06-28 01:15:34 -07:00
teknium1	4f16950e9a	docs: add infographic for #32421 content-filter fallback fix	2026-06-28 01:15:21 -07:00
teknium1	578e3989d4	fix(agent): route content-filter stream stalls to fallback chain (#32421 ) When a provider's output-layer safety filter (MiniMax "output new_sensitive (1027)", Azure content_filter, etc.) kills a streaming response after deltas were already sent, interruptible_streaming_api_call swallows the raw error into a finish_reason=length partial-stream stub. The conversation loop then burned 3 continuation retries against the SAME primary — re-hitting the content-deterministic filter every time — and gave up with "Response remained truncated after 3 continuation attempts", never consulting fallback_providers. Builds on @595650661's classifier change (cherry-picked) so error_classifier recognizes the filter; then: - chat_completion_helpers: run the swallowed error through error_classifier at the stub-creation point and stamp _content_filter_terminated on the stub (single source of truth — no parallel pattern list). - conversation_loop: read the tag and activate the fallback chain BEFORE burning any continuation retries; roll partial content back to the last clean turn and re-issue against the new provider (restart_with_rebuilt_messages). Plain network stalls are unaffected (only content_policy_blocked is tagged). Credits #32479 (@sweetcornna) and #33845 (@Tranquil-Flow) which fixed the same issue via the stub-tag and loop-escalation approaches respectively. Live E2E confirmed: before, _try_activate_fallback called 0x; after, fallback fires on the first stub and the fallback provider completes the turn.	2026-06-28 01:15:21 -07:00
595650661	b8e2268628	fix(agent): add MiniMax 'new_sensitive' to content_policy_blocked patterns The MiniMax output-layer safety filter surfaces the error verbatim as `output new_sensitive (1027)` (sometimes with additional provider wrapping like 'Stream stalled mid tool-call: output new_sensitive (1027)'). When the model emits a large tool-call argument block, the upstream filter trips and the SSE stream is truncated mid-flight, producing 'stream stalled mid tool-call' errors. Until now this case was misclassified and retried 3x on the same provider, reproducing the same refusal and burning paid attempts. Adding `new_sensitive` to `_CONTENT_POLICY_BLOCKED_PATTERNS` routes it through the existing is_client_error path: skip 3x retry, activate configured fallback model immediately, surface a clear provider-safety message to the user. Refs #32421	2026-06-28 01:15:21 -07:00
Teknium	c9df4bc094	fix(gateway): default restart_drain_timeout to 0 to kill systemd crash loop (#54066 ) A restart now interrupts in-flight agents immediately rather than holding the gateway open for a grace window. The previous 180s default coupled two independently-set timers: the gateway's own drain timer and systemd's TimeoutStopSec. On a stale unit where TimeoutStopSec < drain, systemd SIGKILLed the gateway mid-cleanup, leaving a stale lock that made the next startup exit immediately ('already running') — an infinite crash loop under Restart=on-failure (#31981). Setting drain to 0 makes the mismatch structurally impossible: with drain 0 the generated unit gets TimeoutStopSec=90 against a near-instant drain, so systemd never kills mid-cleanup. Contract: restart the gateway, in-flight work stops. A grace window large enough to 'save' a long agent turn would have to outlast an unbounded task, which is impossible. Also fixes the stale-unit warning's suggested command (hermes gateway service install --replace -> hermes gateway install --force); the former subcommand does not exist. Closes #31981	2026-06-28 01:14:34 -07:00
teknium1	0800f1c28b	infographic: whatsapp send-queue serialization (#33360)	2026-06-28 01:10:14 -07:00
teknium1	cb9f855c2b	test(whatsapp-bridge): drop structural send-queue integration test The .integration.test.mjs greps bridge.js source text for the queue wiring — a change-detector that breaks on any benign refactor of the same code. The behavioral unit test (bridge.sendqueue.test.mjs) already covers FIFO ordering, error isolation, timeout propagation, and single-consumer concurrency, which is the contract that matters.	2026-06-28 01:10:14 -07:00
Tranquil-Flow	c393a8e55f	fix(whatsapp-bridge): serialize sendMessage to prevent cross-chat contamination (#33360) Concurrent sock.sendMessage() calls on a single Baileys socket can cause the WhatsApp protocol-level routing to misdeliver messages: responses intended for one chat appear in another. Add a promise-based send queue that serialises all sendMessage() calls across concurrent HTTP /send, /edit, and /send-media handlers so only one send is in-flight at a time. Includes unit tests for queue ordering, error isolation, timeout propagation, and single-consumer concurrency semantics, plus an integration check that the queue is wired into sendWithTimeout.	2026-06-28 01:10:14 -07:00
teknium1	1f72ad9be9	refactor(cli): extract interrupt recovery to a testable helper Pull the #33271 post-interrupt recovery (flush_stdin + _force_full_redraw) out of process_loop's finally block into _recover_terminal_after_interrupt(), and replace the inline-logic-copy tests with ones that exercise the real helper plus a source guard that process_loop still invokes it behind the _last_turn_interrupted gate.	2026-06-28 01:08:09 -07:00
zccyman	f3aaba7f85	fix(cli): recover terminal state after interrupt to prevent raw control sequence freeze When the agent is interrupted during processing, prompt_toolkit's renderer and VT100 input parser can be left in an inconsistent state. CSI 6n cursor position report responses leak as literal text (^[[19;1R) and the terminal stops accepting keyboard input. Fix: in process_loop's finally block, after an interrupted turn: - flush_stdin() to drain stray escape bytes from the OS input buffer - _force_full_redraw() to reset prompt_toolkit's renderer cache Closes #33271	2026-06-28 01:08:09 -07:00
teknium1	2e1b48ed31	chore: map kurlyk local email → skabartem for PR #32867 salvage	2026-06-28 01:08:04 -07:00
kurlyk	def97bcd96	fix: eliminate race condition in OpenAI client replacement Make check-and-replace atomic in _ensure_primary_openai_client by keeping both operations under the same lock acquisition. Previously, the lock was released between detecting a closed client and replacing it, allowing two threads to simultaneously replace the client. Fixes #32846 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-28 01:08:04 -07:00
teknium1	4a0fe4e54a	docs: add PR infographic for #32762 clarify-expiry fix	2026-06-28 01:07:53 -07:00
teknium1	aacc15b2c9	fix(clarify): raise default clarify_timeout to 3600s (#32762) The 600s default evicted the gateway clarify entry while users were still away (meeting/AFK); a later button tap then landed on a dead entry and the agent hung on 'running: clarify'. Raise the default to 1h in DEFAULT_CONFIG and the get_clarify_timeout() code-level fallback, documenting the running-agent-guard tradeoff. User overrides still win.	2026-06-28 01:07:53 -07:00
konsisumer	3f543229f2	fix(telegram): notify user when clarify button tap arrives after expiry	2026-06-28 01:07:53 -07:00
Teknium	90d25adc9e	fix(gateway): deliver profile-scoped cache media on symlinked HERMES_HOME (#54060) Generated images under a profile gateway's cache (profiles/<name>/cache/images/...) were silently dropped from Telegram/Discord delivery when HERMES_HOME is symlinked under a denied prefix (e.g. /opt/data -> /root/.hermes) and $HOME is not that prefix. The resolved path lands under /root (a system denylist prefix), the root-home exception only fires when the denied prefix IS $HOME, and the static safe-roots list only covers the active HERMES_HOME's top-level cache, not per-profile cache dirs. Both gates fail, so validate_media_delivery_path returns None and the gateway logs 'Skipping unsafe MEDIA directive path'. _media_delivery_allowed_roots() now also enumerates per-profile cache roots (<root>/profiles/*/cache/{images,audio,videos,documents,screenshots}) at check time. Allowlist match runs before the denylist, so the profile artifact delivers regardless of the /root interaction; profile-dir credentials (auth.json) stay blocked since they aren't under a cache subdir. Reopened regression of #34485/#38108, neither of which covered the profile-scoped symlink case. Fixes #31733.	2026-06-28 01:07:28 -07:00
sweetcornna	2701ea2f0c	fix(agent): reopen fallback chain after primary recovery	2026-06-28 00:57:42 -07:00
teknium1	7b9ff310b6	fix: salvage #33830 for current main — relocate allow_bots bridge to telegram plugin hook, fix stale adapter import in test	2026-06-28 00:57:03 -07:00
sweetcornna	fc70d023d8	fix(telegram): apply bot auth policy to Telegram sources # Conflicts: # gateway/config.py	2026-06-28 00:57:03 -07:00
sweetcornna	002357a83f	fix(tui): repump stdin after readable handler errors	2026-06-28 00:53:29 -07:00
teknium1	3a03d03bdc	docs: add infographic for #30636 macOS state.db fix	2026-06-28 00:53:19 -07:00
teknium1	52d774f0f9	fix(state): F_FULLFSYNC barrier at WAL checkpoints on macOS (#30636 ) On Darwin, synchronous=FULL (the WAL default) only issues a plain fsync(), which Apple documents does NOT guarantee writes reach stable storage or stay ordered. SQLite's WAL corruption-safety guarantee assumes the OS honors the fsync barrier; macOS does not unless the app uses F_FULLFSYNC. During a launchd system shutdown the page cache is dropped (effectively power-loss for in-flight pages), so a WAL checkpoint whose fsync 'reported' durable may never hit the platter — corrupting state.db with a malformed image. That is the trigger in #30636 ('SIGTERM during launchd shutdown under high load'). Apply PRAGMA checkpoint_fullfsync=1 (macOS-guarded) in apply_wal_with_fallback. It forces the F_FULLFSYNC barrier only at checkpoint boundaries (where WAL frames land in the main DB), so cost amortizes to ~+0.1ms/commit vs ~+4ms for the broader fullfsync=1. No-op off Darwin (F_FULLFSYNC is macOS-only). Root-cause analysis by @catapreta on #30636. Supersedes #30654, whose synchronous=FULL is a no-op (already FULL in WAL mode) and whose TRUNCATE-on-close is already on main. Co-authored-by: catapreta <catapreta@users.noreply.github.com>	2026-06-28 00:53:19 -07:00
Gille	9229d0db17	fix(moa): preserve Nous provider identity for references	2026-06-28 00:47:15 -07:00
Teknium	7c38249c79	feat(moa): references see full tool state + fire on every user/tool response (#54016 ) The advisory reference view stripped all tool calls and tool results, so reference models judged a task whose actions and results they never saw — and references only fired once per user turn, never re-running as the agent's state advanced through the tool loop. Two fixes: - _reference_messages() now PRESERVES the agent's tool calls and tool results, rendering them inline as text ([called tool: ...] / [tool result: ...]) so a reference gives an informed judgement on the real current state. Still emits zero tool-role messages and zero tool_calls arrays (strict providers reject those), and large tool results are previewed head+tail (4000-char budget). The required end-on-user shape is met by APPENDING a synthetic advisory user turn — not by deleting the agent's latest context (which the prior fix did). - References now re-run on every state change — each new user message AND each new tool result — instead of once per user turn. The state-sensitive advisory signature drives the cache: new tool result = miss (re-run), identical-state re-call = hit (no re-run, no re-emit). The acting aggregator still receives the full, untrimmed transcript.	2026-06-28 00:30:11 -07:00
kshitijk4poor	fc7a01b6cb	test+harden: modernize salvaged Matrix path for current plugin layout Two follow-ups on top of the salvaged #46365 fix: 1. Tests: the salvaged tests injected the ephemeral MatrixAdapter via sys.modules["gateway.platforms.matrix"], but Matrix migrated to a plugin (#41112) and the fallback now imports from plugins.platforms.matrix.adapter. Point the three sys.modules patches at the current module path so the ephemeral-fallback tests actually exercise the injected fake adapter. 2. Harden the live-adapter lookup: split the gateway import guard from the adapter lookup and log (instead of silently swallowing) when a runner exists but adapters.get() raises. A silent fall-through there would re-introduce the per-send reconnect/OTK-exhaustion storm this fix exists to prevent (#46310). Documented that the live adapter is gateway-owned and must not be disconnected, and why the ephemeral finally never touches it.	2026-06-28 12:48:08 +05:30
liuhao1024	a7fd62d824	fix(send_message): reuse live gateway adapter for Matrix media sends When a live gateway adapter is available (i.e. the tool runs inside a running gateway), reuse the persistent connection instead of creating a new MatrixAdapter per call. This eliminates per-message E2EE re-init storms that exhaust recipient OTKs and silently drop messages. The fix follows the same pattern as _send_to_platform (line 618): gateway_runner_ref → runner.adapters[Platform.MATRIX]. Falls back to the ephemeral connect/disconnect cycle for standalone contexts. Also extracts the shared send logic into _send_via_matrix_adapter() to avoid duplicating the media dispatch code between the two paths. Fixes #46310	2026-06-28 12:48:08 +05:30
Ben Barclay	1466eab4ee	test(docker): wait for cont-init to finish before privilege-drop shim tests (#54026 ) The docker-exec privilege-drop shim tests started a sleep container and released the fixture as soon as `docker exec <c> true` returned 0. On s6-overlay that succeeds almost immediately — ~0.05s in measurement — long before the `01-hermes-setup` cont-init hook (docker/stage2-hook.sh) has finished seeding + `chown hermes:hermes` config.yaml and running the Python config migration (cont-init only fully settles at ~9.8s under arm64 QEMU emulation). `test_shim_opt_out_keeps_root` wipes config.yaml, writes it as root with HERMES_DOCKER_EXEC_AS_ROOT=1, and asserts root:root ownership. When the fixture released the test inside that ~10s window, stage2-hook's boot-time `chown hermes:hermes config.yaml` raced the root-written file and reset it to hermes:hermes — failing the assertion. The window is invisible on native amd64 (stage2-hook completes in a blink) but wide open under the arm64 build's QEMU emulation, which is why only build-arm64 flaked while build-amd64 stayed green. Replace the responsiveness poll with a wait on the canonical 'cont-init finished' signal: $HERMES_HOME/logs/container-boot.log gaining a `profile=default` line, written by 02-reconcile-profiles which s6 runs strictly after 01-hermes-setup. Mirrors the readiness pattern already used in test_container_restart.py. Also bumps the readiness timeout 20s->60s to cover slow emulation. No production code change — test-only hardening of a timing race.	2026-06-28 17:06:26 +10:00
Jeffrey Quesnelle	2c9b017696	Merge pull request #54000 from NousResearch/fix/desktop-main-cjs-clobber-stage-simple-git fix(desktop): stop hermes desktop from clobbering tracked main.cjs	2026-06-28 01:56:51 -04:00
Teknium	4f61d48aef	test(cron): deterministically wait for ticker, fix wall-clock flake (#54010 ) tests/cron/test_scheduler_provider.py spawned a background ticker thread, slept a fixed 0.2s, then asserted the loop had called tick()/heartbeat() at least N times. Under loaded CI the worker thread isn't always scheduled within that window, so the loop hadn't ticked yet — flaking with 'provider never called tick()' (assert 0 >= 1). Add a _wait_until(predicate, timeout) helper and replace all five fixed time.sleep(0.2) sites with a poll on the actual predicate (calls/beats count reached). Same contract assertions, no wall-clock dependence.	2026-06-27 22:52:29 -07:00
Teknium	1fa44180b0	fix(moa): advisory references end on a user turn + get a reference-role system prompt (#54007 ) * fix(moa): reference advisory view must end with a user turn MoA reference calls failed with Anthropic models that don't support assistant prefill (e.g. Claude Opus 4.8): '400 ... must end with a user message'. The advisory view built by _reference_messages() kept the last assistant turn's text while dropping the following tool result, leaving a trailing assistant turn — which Anthropic (and OpenRouter->Anthropic) interpret as an assistant prefill to continue. References are advisory and must end on the user turn they answer. Strip trailing assistant turns from the advisory view (preserving intervening ones). Update the existing test that encoded the buggy shape and add a mid-tool-loop regression test. * feat(moa): give reference models an advisory-role system prompt Reference models received the bare trimmed conversation with no role framing, so they assumed they were the acting agent and refused ("I can't access repositories/URLs from here") or tried to call tools they don't have. Prepend a dedicated advisory system prompt to every reference call: the model is an analyst, not the actor — it cannot execute, should not apologize for lacking tools, and should reason about the presented state to advise the aggregator/orchestrator on approach, next steps, tool-use strategy, risks, and anything the acting agent missed. Its output is private guidance for the aggregator, not a user-facing answer.	2026-06-27 22:52:25 -07:00
Teknium	2523917680	fix(tests): bare pytest flags pass through run_tests.sh without a '--' separator (#54008 ) The parallel runner only forwarded pytest args after a literal '--', so a bare 'scripts/run_tests.sh tests/foo.py -q' (or -v/-x/-k/--tb=long) errored out with 'unrecognized arguments'. This contradicted the docstring's promise that common pytest flags pass through, and forced a retry on every run that used pytest muscle-memory. Now any token starting with '-' that isn't one of the runner's own options (-j/--jobs, --paths, --slice, --file-timeout, --generate-slices, --files, --include-integration) is routed to each per-file pytest invocation automatically. Value-taking flags given space-separated (-k expr, -m mark, -p plugin, -o name=val, etc.) keep their value instead of having it stolen by positional-path discovery. The explicit '--' separator still works and stacks with bare flags. - scripts/run_tests_parallel.py: argv splitter routes bare unknown flags to pytest; value-flag lookahead; updated docstring. - scripts/run_tests.sh: usage comment reflects bare-flag passthrough. - tests/test_run_tests_parallel.py: 4 behavior-contract tests (bare -q runs, -k keeps its value/filters, '--' still works, positional path stays a root).	2026-06-27 22:43:26 -07:00
emozilla	2d206a3a42	fix(desktop): stop hermes desktop from clobbering tracked main.cjs (#52735) `npm run build` ended with `bundle-electron-main.mjs`, which esbuild-bundled electron/main.cjs and renamed the bundle on top of the tracked source file. Because every `hermes desktop` runs `npm run build`, each launch rewrote a checked-in source file (~7.5k-line source -> ~14.8k-line bundle), dirtying the working tree with a build artifact that `git restore` couldn't keep (the next launch re-clobbered it) and forcing autostash/restore conflicts on update. The bundle only existed to inline `simple-git` so the packaged app.asar (which ships no node_modules) wouldn't crash at launch with "Cannot find module 'simple-git'". Replace it with the mechanism the repo already uses for the other hoisted runtime dep (node-pty): stage the dependency closure and resolve it from process.resourcesPath at runtime. - stage-native-deps.cjs: resolve simple-git's runtime closure (walking dependencies + optionalDependencies, so a version bump that adds a transitive dep can't silently reintroduce the crash) and stage it under build/native-deps/vendor/node_modules/. The `vendor/` nesting is load-bearing: electron-builder drops a node_modules dir at the ROOT of an extraResources copy but keeps a nested one. - git-review-ops.cjs: fall back to the staged native-deps/vendor/node_modules/simple-git when the hoisted require() fails; dev runs resolve the hoisted copy and never hit the fallback. - package.json: drop the bundler from the `build` script so main.cjs is never a build target again. - nix/desktop.nix: drop the direct bundler call (the closure rides the existing `cp -rn native-deps` into $out) and patch process.resourcesPath in git-review-ops.cjs alongside main.cjs. - delete scripts/bundle-electron-main.mjs. Verified: electron-builder's own file filter keeps the full staged closure (0 dropped), and a packaged win-unpacked build launches with the git-review pane resolving simple-git from the staged vendor path.	2026-06-28 01:30:09 -04:00
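
Note on 674e16e7c6 (DB-connstr redaction): the failure mode and the two-part fix described in that message can be reproduced with a minimal sketch. Everything here is an assumption made for illustration: the project's real _DB_CONNSTR_RE is not shown in this log, so GREEDY_RE, LINE_SAFE_RE, BRACE_EXPR_RE and redact() are hypothetical stand-ins that only mirror the character classes quoted in the commit ([^@]+ versus [^:\s]+ / [^@\s]+) and the code_file flag it mentions.

```python
import re

# Hypothetical stand-ins for the project's pattern (not the real _DB_CONNSTR_RE).
# Old shape: the password group may contain newlines, so it can run on
# until the next '@' anywhere later in the text.
GREEDY_RE = re.compile(r"(postgresql://[^:]+:)([^@]+)(@)")
# Fixed shape: whitespace is forbidden in the userinfo and password groups,
# so a match can never cross a line break.
LINE_SAFE_RE = re.compile(r"(postgresql://[^:\s]+:)([^@\s]+)(@)")
# A password that is purely a {...} expression is an f-string placeholder.
BRACE_EXPR_RE = re.compile(r"\{[^{}]*\}")


def redact(text, pattern, code_file=False):
    """Mask the password group of every DSN-like match with asterisks."""
    def mask(match):
        password = match.group(2)
        if code_file and BRACE_EXPR_RE.fullmatch(password):
            return match.group(0)  # template placeholder, not a live secret
        return match.group(1) + "*" * len(password) + match.group(3)
    return pattern.sub(mask, text)


# A credential-free DSN followed by a decorator on a later line.
SOURCE = (
    'DEFAULT_DSN = "postgresql://localhost:5432/app"\n'
    "\n"
    '@model_validator(mode="after")\n'
    "def check(self):\n"
    "    return self\n"
)

broken = redact(SOURCE, GREEDY_RE)      # swallows the line breaks up to '@'
fixed = redact(SOURCE, LINE_SAFE_RE)    # no match, text left untouched
literal = redact("postgresql://admin:pw@host", LINE_SAFE_RE)
template = redact('f"postgresql://{user}:{pw}@{host}"', LINE_SAFE_RE, code_file=True)
```

With the old-style pattern the port and everything up to the decorator's '@' is treated as a password, so the line breaks are overwritten and the decorator is glued onto the DSN line. The line-safe pattern leaves SOURCE unchanged, still masks the literal admin:pw credential, and keeps the {pw} placeholder when code_file=True.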

13304 commits