hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-25 17:18:11 +00:00

Author	SHA1	Message	Date
mnajafian-nv	d03cdd63eb	fix(cli): run one-shot query cleanup before lease release (#43036) * fix(cli): run one-shot query cleanup before lease release Signed-off-by: mnajafian-nv <mnajafian@nvidia.com> * test(cli): cover quiet one-shot cleanup finalization Signed-off-by: mnajafian-nv <mnajafian@nvidia.com> --------- Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>	2026-06-09 21:52:13 -07:00
Teknium	96af61b6ef	feat(memory,skills): approve/deny gate for memory + skill writes (#38199) Adds memory.write_mode and skills.write_mode (on|off|approve), applied to both foreground turns and the background self-improvement review fork — the source of the unprompted 'wrong assumption' saves users reported. - on (default): write freely, unchanged behaviour - off: never write; the tool returns a clean disabled result - approve: don't commit. Memory foreground writes prompt inline (small, reviewable in a chat bubble); background memory writes and ALL skill writes stage to a pending store instead (a SKILL.md is too large to review inline, and a daemon thread can't block on a prompt) Review staged writes from CLI or any messaging platform: /memory pending|approve|reject|mode /skills pending|approve|reject|diff|mode Skill review respects the size asymmetry: inline you see a one-line gist; the full unified diff stays out-of-band (/skills diff, dashboard, or the staged JSON file). New: tools/write_approval.py (gate + pending store), hermes_cli/write_approval_commands.py (shared CLI+gateway handlers). Gates wired at the single entry points memory_tool() and skill_manage(), using the existing write-origin ContextVar to distinguish foreground from background_review.	2026-06-09 21:51:43 -07:00
Teknium	f082b4ec5c	fix(ci): make parallel runner's exit-4 retry robust for newly-added test files (#42994 ) The per-file test runner re-runs a file once when pytest exits 4 ("file or directory not found") while the file exists on disk — a transient seen on loaded shared CI runners where the planner collects a file (--collect-only counts its tests) but the per-file subprocess fails to stat it moments later. A single immediate retry could land in the same brief high-load window and fail again, and the retry was gated on one Path.exists() check that can itself be a flaky stat under that load — so a freshly-added test file that LPT pins to one shard would deterministically red that shard on every run (no actual test failure; the file just never executes). - Extract the subprocess spawn/communicate/process-tree-kill logic into a shared _spawn_pytest_once() helper (removes ~90 lines of duplication between the primary run and the retry). - Replace the single-shot retry with a bounded backoff loop (_EXIT4_RETRY_ATTEMPTS, escalating sleep) that re-runs while the file is present on disk. - Add _file_present() which re-checks existence across a few spaced stats, so a single flaky negative stat doesn't wrongly conclude the file is missing. A genuinely-missing file (typo/deleted) still fails fast — exit 4 is not swallowed when the file truly does not exist. - Tests: transient-then-pass recovery, genuinely-missing fails fast with no retry, give-up after max attempts, and _file_present transient/missing cases.	2026-06-09 21:39:09 -07:00
Ben Barclay	5cf6e28a2f	fix(gateway): auto-start after container restart via planned-stop marker (#42675 ) (#43236 ) * fix(gateway): auto-start after container restart via planned-stop marker On Docker (s6-overlay), the gateway runs as a dynamically-registered s6 service. When the container stops/restarts/upgrades, s6 sends the gateway a plain SIGTERM. The shutdown path (_stop_impl) ended with an unconditional _update_runtime_status("stopped"), persisting gateway_state=stopped to the volume. container_boot.py reads that on the next boot and only auto-starts gateways whose last state was "running" (_AUTOSTART_STATES) — so after a routine `docker compose up --force-recreate` the gateway stays down and messaging channels silently go dark, with no error surfaced (issue #42675). The codebase already distinguishes intentional stops from unexpected signals via the planned-stop marker (write_planned_stop_marker / consume_planned_stop_marker_for_self): `hermes gateway stop`, systemd/launchd ExecStop, and Ctrl+C write a marker before signalling, so the handler classifies them as planned. An unmarked SIGTERM (container/s6 restart, OOM, bare kill) is signal-initiated. This wires that existing classification through to the state persist, rather than adding unreliable signal-source inference: - run.py: GatewayRunner._signal_initiated_shutdown, set in shutdown_signal_handler's unmarked-signal branch. In _stop_impl, a signal-initiated (non-restart) teardown now persists "running" instead of "stopped" — preserving the operator's run-intent and overwriting the mid-shutdown "draining" marker so _AUTOSTART_STATES matches on reboot. Operator stops and restarts persist "stopped" as before. - service_manager.py: S6ServiceManager.stop() now writes the planned-stop marker for the supervised PID (read from s6-svstat) before `s6-svc -d`, so an in-container `hermes gateway stop` is correctly classified as intentional (parity with the systemd/launchd/host stop paths, which already mark). 
Best-effort: a marker-write failure falls back to the safe signal-initiated path. Tests: shutdown persist-decision table (signal→running, operator→stopped, restart→stopped), s6 stop marker write + svstat PID parse + failure tolerance. The signal→running and s6-marker tests fail without the respective source change. Verified end-to-end against a container built from this branch: an unmarked SIGTERM to the live gateway leaves gateway_state=running (shutdown-context log confirms signal path); existing real container-restart suite still green. * docs(docker): clarify gateway autostart distinguishes operator-stop from container-kill The per-profile-supervision section described the autostart-across-restart contract as "running gateways come back, stopped stay stopped" without spelling out what records 'stopped'. That contract was the source of #42675 confusion: users expected a restart to bring the gateway back and it didn't. With the write-side fix, only an explicit `hermes gateway stop` records 'stopped'; container/s6 restart SIGTERMs (incl. image upgrades and unexpected exits) leave the state 'running' so the gateway auto-starts. Make that distinction explicit in both the multi-profile and per-profile-supervision sections. * test(docker): real-restart autostart E2E for #42675 Adds test_live_gateway_autostarts_after_real_restart_without_manual_state_stamp: a live s6-supervised gateway is killed by an actual `docker restart` SIGTERM (no manual gateway_state stamp, no planned-stop marker) and must auto-start on the next boot. Exercises the WRITE side of the fix that the existing stamp-based tests bypass. Verified to FAIL against an origin/main image (reconciler logs prior_state=stopped action=registered — the #42675 bug) and PASS against the fixed image (prior_state=running action=started).	2026-06-10 14:01:34 +10:00
Siddharth Balyan	b4170f3ac2	fix(cron): don't strict-scan script-injected output in no-skills jobs (#43223 ) The runtime assembled-prompt scan (#3968 lineage) selected its pattern tier on has_skills alone. A script-driven, no-skills job injects its script's stdout into the prompt, and that blob was scanned with the STRICT user-prompt pattern set — so any command-shape string in the data feed (e.g. a triage bot ingesting a bug report that quotes `rm -rf /`) hard-blocked the job on every tick. Script output and context_from output are runtime DATA produced by operator-authored code — the same trust class as install-vetted skill markdown, not a user-authored directive prompt. Select the scan tier by what the assembled prompt CONTAINS: when it includes skill content OR injected data, use the looser _scan_cron_skill_assembled set (keeps unambiguous injection directives, drops command-shape patterns, sanitizes invisible unicode instead of blocking). Defense-in-depth is preserved: - The raw user prompt is still strict-scanned at create/update (api_server paths untouched) AND re-scanned strict at runtime even when the looser tier was selected for the data blob. - Plain no-script/no-skills jobs keep the strict scan on the whole assembled prompt. - Injection directives arriving via script stdout still block. Rejected alternative: removing destructive_root_rm from the strict set or a per-job skip_injection_scan flag — both weaken the guard globally.	2026-06-10 08:27:24 +05:30
Ben Barclay	7df3aa34b1	fix(dashboard-auth): warn when public_url override is silently rejected (#43214 ) A non-empty HERMES_DASHBOARD_PUBLIC_URL / dashboard.public_url value that fails URL validation (overwhelmingly: a missing http(s):// scheme, e.g. "hermes.domain.com") was silently discarded by resolve_public_url(), falling back to reconstructing the OAuth redirect_uri from request headers. Behind a reverse proxy that doesn't forward X-Forwarded-Proto reliably, that yields an http:// callback even though the operator explicitly set the public URL — with no signal as to why (#42780). Emit a deduplicated operator-facing WARNING (once per distinct value, since resolve_public_url runs per request) naming the offending value and the required scheme. Turns a silent footgun into a self-diagnosing one; behaviour is otherwise unchanged. Tests assert the warning fires for a scheme-less value, is deduplicated across repeated calls, and stays silent for a valid value — all three fail without the fix.	2026-06-10 12:14:57 +10:00
BROCCOLO1D	29036155ce	fix(terminal): lazy-parse docker env config (#42733) Co-authored-by: BROCCOLO1D <279959838+BROCCOLO1D@users.noreply.github.com>	2026-06-10 11:04:27 +10:00
xxxigm	93340fa3c1	fix(tui_gateway): honor target profile's terminal.cwd on desktop profile switch (#40892 ) * fix(tui_gateway): honor target profile's terminal.cwd on desktop profile switch The desktop's app-global remote mode serves every profile from one tui_gateway backend, so the process-global TERMINAL_CWD only reflects the launch profile. After switching profiles, a new session resolved its workspace from that stale env var and inherited the previous profile's directory. Add _profile_configured_cwd() to read a non-launch profile's own terminal.cwd from its config.yaml (skipping placeholder/empty/missing and non-existent paths so callers fall back cleanly), and wire it into _completion_cwd() with precedence: explicit client cwd -> existing session cwd -> bound profile's configured cwd -> TERMINAL_CWD -> os.getcwd(). Fixes #40334 * test(tui_gateway): cover per-profile cwd resolution (#40334) Pin the new contract: _profile_configured_cwd reads a profile's own terminal.cwd and rejects placeholders/missing paths, and _completion_cwd prefers a bound profile's cwd over a stale launch-profile TERMINAL_CWD while still letting an explicit client cwd win.	2026-06-09 19:45:29 -05:00
brooklyn!	aecdacb11b	Merge pull request #43109 from NousResearch/fix/desktop-remote-attach-drops fix(desktop): stage dropped files into the remote session workspace	2026-06-09 19:22:11 -05:00
Brooklyn Nicholson	7ffc216bc0	fix(agent): make a binary @file: reference actionable instead of a dead end A binary @file: ref (PDF, docx, spreadsheet, …) expanded to a bare "binary files are not supported" warning with no content. The model saw a failure and gave up — e.g. a dropped PDF came back as a text note claiming the type was unsupported, even though the file was staged on disk right next to it. Inject an actionable content block instead: the path, mime type, size, and a nudge to use its tools to read/convert/view the file (and explicitly not to tell the user the type is unsupported). General across every binary type — not PDF-specific. The file already resolves where the agent's tools run (local cwd or the staged copy in a remote session workspace), so it can act on it directly.	2026-06-09 19:16:46 -05:00
brooklyn!	218452b050	fix(state.db): recover from malformed sqlite_master so hidden sessions reappear (#43149 ) * fix(state.db): recover from malformed sqlite_master so hidden sessions reappear The corruption class behind "Desktop/Dashboard show no sessions while hundreds of session files sit on disk" is a malformed sqlite_master — most often a duplicate object row, e.g. two CREATE VIRTUAL TABLE messages_fts entries — surfacing as: sqlite3.DatabaseError: malformed database schema (messages_fts) - table messages_fts already exists SQLite parses the whole schema while preparing the FIRST statement on a connection, so on this class every statement fails before it runs: PRAGMA journal_mode (which is where SessionDB.__init__ actually trips, in apply_wal_with_fallback, BEFORE _init_schema), PRAGMA integrity_check, and even DROP TABLE. The only operations that still work are PRAGMA writable_schema=ON plus direct sqlite_master surgery. A plain FTS-index rebuild at the _init_schema layer therefore cannot reach or fix this; the canonical sessions/messages rows are intact — only the derived schema is broken. Add a dedicated recovery that operates where the failure actually happens: - hermes_state.repair_state_db_schema(): backs up the raw file first, then a least-destructive ladder — (1) de-duplicate sqlite_master keeping the lowest rowid per object (preserves the existing FTS index), escalating to (2) drop every messages_fts* schema object + VACUUM and let the next open rebuild the FTS index from messages. sessions/messages are never modified. Plus is_malformed_db_error() to discriminate this class. - SessionDB.__init__ auto-heals: on a malformed-schema open error it repairs once (process-guarded against loops / concurrent web_server opens) and reopens, so Desktop/Dashboard recover on their own instead of silently showing "no sessions". - hermes doctor --fix detects the malformed class and repairs it (reporting the recovered session count + backup name). 
- hermes sessions repair [--check-only] [--no-backup] runs on the raw file path, since SessionDB() itself cannot open a malformed DB. Supersedes #32589 and #33869: both targeted FTS corruption but gated their repair behind statements (integrity_check / SELECT / DROP TABLE) that themselves fail on this class, and neither addressed the apply_wal_with_fallback open-time failure. Credit preserved via Co-authored-by. Closes #33865. Co-authored-by: João Vitor Cunha <145560011+plcunha@users.noreply.github.com> Co-authored-by: Tuna Dev <273476039+tuancookiez-hub@users.noreply.github.com> * test(state.db): cover strat-B escalation + unrepairable safe-fail paths --------- Co-authored-by: João Vitor Cunha <145560011+plcunha@users.noreply.github.com> Co-authored-by: Tuna Dev <273476039+tuancookiez-hub@users.noreply.github.com>	2026-06-09 18:49:08 -05:00
Teknium	57c6714995	fix(models): keep curated Anthropic aliases in /model picker (#43103) The Anthropic picker returned the live /v1/models dump verbatim whenever credentials were configured. Anthropic's API lags newly-routed curated aliases (e.g. claude-fable-5, reachable on Anthropic before the models endpoint enumerates it), so the curated entry vanished from the picker. Merge curated _PROVIDER_MODELS["anthropic"] with the live catalog — curated first, live-only appended, deduped — mirroring the OpenAI curated-merge path. Live failure / no creds falls back to curated verbatim.	2026-06-09 14:45:19 -07:00
brooklyn!	8d71c38919	fix(desktop): rebind sessions after websocket reconnect (salvage of #41740) (#43004) * fix(desktop): rebind sessions after websocket reconnect * docs(desktop): explain the reconnect-resume guard in use-route-resume The reconnect fix turns on two subtle conditions with no inline rationale: `seenGatewayStateRef` suppresses a spurious "became open" on the first effect run (so a session mounting with the gateway already open doesn't double-resume), and the `gatewayBecameOpen ||` arm forces a re-resume even when the route looks `alreadyActive` because the cached runtime id can be stale after the gateway rebinds/reaps the session. Comment both so the next reader doesn't "simplify" them back into the original bug. No behavior change. --------- Co-authored-by: Josh Dow <josh.dow@prepad.io>	2026-06-09 19:01:00 +00:00
Siddharth Balyan	46fedef07f	fix(openrouter): never send reasoning field for adaptive Anthropic models (#43012 ) The previous fix (#42991) only omitted reasoning when it was being disabled. But reasoning-mandatory Anthropic models (Claude 4.6+, fable) 400 with thinking.type.disabled on EVERY tool-continuation turn even when reasoning is enabled: chat_completions never replays signed thinking blocks, so the prior assistant tool_call has no thinking, and OpenRouter resolves "reasoning requested but history has none" by emitting thinking.type.disabled — which these models reject. Result: first turn works, every turn after the first tool call dies (HTTP 400, non-retryable). OpenRouter ignores reasoning.effort for adaptive Anthropic models anyway (the model self-decides), so the reasoning field is pointless for them on every turn and harmful on tool-replay turns. Omit it entirely → adaptive default. - openrouter profile: drop the reasoning field for reasoning-mandatory Anthropic models regardless of enabled/disabled; legacy Anthropic + non-Anthropic models unchanged. - tests: assert omission across enabled/disabled/effort variants; parity tests switched to a non-Anthropic reasoning model (deepseek) since Anthropic 4.6+ no longer carries a reasoning field. Verified live end-to-end: a tool-replay turn on anthropic/claude-fable-5 with reasoning enabled now builds extra_body=None and returns HTTP 200 (was 400).	2026-06-10 00:18:23 +05:30
brooklyn!	ba44de06da	fix(install): self-heal a stuck Electron download (salvage of #42894 ) (#42998 ) * fix(install): self-heal a stuck Electron download on the desktop build The desktop build downloads Electron (~114MB) from GitHub. A corrupt cached zip, or a blocked/throttled GitHub release host (the repeating "retrying" log), hard-failed the install — and install.sh had no recovery at all while install.ps1 / `hermes desktop` only purged the cache. All three build paths now escalate on a failed `npm run pack`: GitHub → purge corrupt electron-.zip + stale -unpacked and retry → one retry via a public Electron mirror (npmmirror.com). @electron/get SHASUM-verifies the download, and a user-pinned ELECTRON_MIRROR is always respected (never overridden). Adds a bash clear_electron_build_cache()/_desktop_pack() to mirror the existing PowerShell/Python helpers. * test(install): cover the Electron mirror fallback Verify `hermes desktop` falls back to a mirror when the cache purge finds nothing, and that a user-pinned ELECTRON_MIRROR is respected (no extra attempt, not overridden). * docs(desktop): troubleshoot a stuck Electron download Document the automatic cache-purge + mirror fallback, how to pin your own ELECTRON_MIRROR, and how to clear a corrupt cached zip by hand. * docs(install): correct the Electron mirror trust framing The mirror-fallback comments and the desktop troubleshooting doc implied `@electron/get`'s SHASUM check makes the npmmirror.com download safe against tampering. It doesn't: the SHASUMS256.txt is fetched from the same mirror, so the check guards against a corrupt/partial download, not a compromised mirror. Reframe all four surfaces (install.sh, install.ps1, `hermes desktop`, and the docs) to state the trust trade-off honestly — npmmirror.com is the de-facto Electron community mirror, we only fall back to it after the canonical GitHub download fails, and a user-pinned ELECTRON_MIRROR is never overridden. No behavior change. 
--------- Co-authored-by: xxxigm <tuancanhnguyen706@gmail.com>	2026-06-09 18:19:14 +00:00
Siddharth Balyan	1febb08240	fix(anthropic): default new Claude models to the modern thinking contract (#42991 ) New Anthropic models without a recognized version substring (claude-fable-5 and future named/numbered releases) were classified as legacy and routed down the manual-thinking path, which made OpenRouter emit thinking.type.disabled — a form reasoning-mandatory Claude models reject with a non-retryable HTTP 400. Invert the brittle version-substring allowlists to default-to-modern (mirroring _get_anthropic_max_output): unknown Claude models get the adaptive/xhigh/ no-sampling contract, with an explicit legacy list for older families. Non-Claude Anthropic-Messages models (minimax, qwen3, …) keep the manual path. - anthropic_adapter: _supports_adaptive_thinking / _supports_xhigh_effort / _forbids_sampling_params now default unknown Claude models to modern; legacy families enumerated in _LEGACY_MANUAL_THINKING_CLAUDE_SUBSTRINGS. - openrouter profile: omit reasoning entirely (→ adaptive default) instead of forwarding {enabled:false} for reasoning-mandatory Anthropic models; legacy Anthropic + all non-Anthropic models still pass the disable form through. - model_metadata + output-limit table: register claude-fable-5 (1M ctx, 128K out). Tests assert the invariant ("unknown Claude model -> modern contract; legacy stays manual; non-Claude unaffected"), not specific model names.	2026-06-09 23:37:23 +05:30
Frowte3k	39b76d9013	fix(packaging): ship optional-mcps catalog in wheel and sdist (#39859 ) The shipped MCP catalog (optional-mcps/) wasn't packaged, so `hermes mcp catalog` and the dashboard catalog screen come up empty on pip/Homebrew/Nix installs even though the manifests exist in the repo. The runtime expects a packaged catalog (get_optional_mcps_dir() -> _get_packaged_data_dir("optional-mcps"); list_catalog() returns [] when it's absent). Ship it like locales: pyproject [tool.setuptools.data-files] for the wheel + a MANIFEST.in graft for the sdist. optional-mcps/ is nested (optional-mcps/<name>/manifest.yaml) and data-files flattens each glob into its target dir, so each catalog entry gets its own target to preserve the per-entry directory the catalog iterates over.	2026-06-09 14:03:20 -04:00
Teknium	967c325da8	fix(models): read OpenRouter live context_length before hardcoded catch-all (#42986 ) OpenRouter-routed slugs that are absent from models.dev (e.g. a freshly shipped anthropic/claude-fable-5) fell through to the generic DEFAULT_CONTEXT_LENGTHS["claude"]=200K entry and under-reported their real 1M window. The step-6 OpenRouter live-metadata fallback was gated on `not effective_provider`, but an OpenRouter selection sets effective_provider="openrouter" (inferred from the base URL), so that branch was dead code for every OR model. Add a dedicated step-5 OpenRouter branch that consults the live /models catalog (authoritative, refreshes as new slugs ship) before models.dev and the hardcoded family defaults — mirroring the existing Nous/Copilot/GMI branches. Keeps the Kimi-family 32k underreport guard. Per-model values are respected (claude-haiku-4.5 stays 200K), so it does not blanket-bump to 1M. Regression tests cover the fable-5 case, the genuinely-200k case, and the Kimi guard.	2026-06-09 10:49:32 -07:00
Teknium	f6f573ebaa	feat(plugins): install from a subdirectory within a repo (#42963 ) Support installing a plugin that lives in a subdirectory of a larger repo (docs/tests at root, plugin in a subdir) without forcing a dedicated single-plugin repo. Identifier syntax: owner/repo/path/to/plugin (shorthand + subpath) <url>.git/path/to/plugin (.git boundary on GitHub-style URLs) <url>#path/to/plugin (explicit fragment, any scheme) _resolve_git_url now returns (git_url, subdir); _install_plugin_core reads the manifest from and moves only the subdir, so root-level docs and tests no longer leak into ~/.hermes/plugins. _resolve_subdir_within guards against path traversal, missing dirs, and non-directories. Both the CLI (hermes plugins install) and the dashboard install endpoint inherit this for free since they share _install_plugin_core. Dashboard install hint + placeholder updated to advertise the subdir syntax. Co-authored-by: Austin Pickett <pickett.austin@gmail.com>	2026-06-09 13:42:51 -04:00
Gille	c6dc2fcd21	fix(desktop): release profile backends before delete (#42613)	2026-06-09 10:52:02 -05:00
Philip D'Souza	92dfd70d6a	fix(photon): production hardening for the gRPC-native iMessage channel (#42732 ) * fix(photon): override transitive CVEs in the sidecar deps `npm audit` flagged 7 high-severity transitive CVEs (protobufjs code injection GHSA-66ff-xgx4-vchm + outdated @opentelemetry OTLP exporters) pulled in via spectrum-ts -> @photon-ai/otel. npm's suggested fix downgrades spectrum-ts to a version that targets the decommissioned spectrum host, so instead pin patched versions via `overrides` (protobufjs 8.6.1, @opentelemetry/* 0.218.0) without touching spectrum-ts. `npm audit` -> 0; spectrum-ts + provider still import. * fix(photon): harden the sidecar bridge + bound the dedup cache - constant-time sidecar control-token comparison (was `!==`, timing-attackable). - cap the control-channel request body (2 MiB) so a compromised local peer can't OOM the sidecar. - wrap the inbound gRPC stream consumer in a re-subscribe loop with capped exponential backoff + jitter — if the async iterator throws/ends it would otherwise stop inbound forever (the adapter dedupes any replay). - add an unhandledRejection handler so a stray rejection logs instead of killing the process. - dedup cache (adapter) was a true bounded LRU only for expired entries; a burst of unique ids within the window grew it without limit. Evict oldest at the cap. * chore: add AUTHOR_MAP entry for PhilipAD --------- Co-authored-by: PhilipAD <philipadsouza@gmail.com>	2026-06-09 11:12:58 -04:00
Brian D. Evans	b5421f4ba6	fix(deps): declare packaging as a core dependency so it ships everywhere (#40522 ) * fix(deps): declare packaging as a core dependency so it ships everywhere packaging is imported directly on three production paths but was never declared in [project.dependencies], so it only reached users transitively (pip/uv pull it for other tools). The slim official Docker image ships without it, where each try/except-ImportError fallback silently degrades: - plugins/memory/hindsight/__init__.py (_meets_minimum_version) returns False when packaging is absent, disabling update_mode='append' so every session leaks separate Hindsight documents (the reported #40503 symptom). - tools/lazy_deps.py (_is_satisfied) falls back to "installed counts as satisfied", defeating every version-constraint check on lazy extras. - hermes_cli/main.py drops to naive name==version requirement parsing. Promote it to a declared core dep pinned to packaging==26.0 — the exact version already resolved in uv.lock, so there is zero resolution churn (the lock change is two edge annotations marking it transitive->direct). It is a pure-Python py3-none-any wheel with no compiled extensions, safe to ship on every platform. Declaring it also wires it into the _verify_core_dependencies_installed() update-repair guard, which reinstalls missing [project.dependencies] on hermes update. Adds a hermetic tomllib-parse regression test that fails before the declaration and passes after. Fixes #40503 * test(deps): make packaging dep-name extraction PEP 508-robust Address Copilot review on #40522: the inline name-extraction only handled ==, >=, [ and ; and could mis-parse valid requirement strings using <=, ~=, !=, <, > or a direct reference (name @ url). Factor a _distribution_name helper that drops markers, direct-reference URLs and extras, then strips any version operator via regex, so a future dep declared with any PEP 508 specifier shape is matched correctly. 
--------- Co-authored-by: briandevans <252620095+briandevans@users.noreply.github.com>	2026-06-09 11:11:48 -04:00
xxxigm	57775e9e16	test(agent): cover char-based output-cap overflow parsing (#42741 ) Add TestParseCharBasedOutputCap for the LM Studio / llama.cpp phrasing (context in tokens, prompt in characters): the reported error resolves to the available output budget, the retried cap plus the estimated input stays inside the window, and a prompt larger than the window falls through to None so the prompt-too-long/compression path still owns that case.	2026-06-09 03:17:12 -07:00
teknium1	24a934295f	test(yuanbao): add missing patch import to pipeline tests The salvaged refactor's new tests use unittest.mock.patch (25 call sites) but the import line only brought in AsyncMock and MagicMock, so 10 of the new tests failed with NameError. Add patch to the import.	2026-06-09 03:17:00 -07:00
loongzhao	ffcd9d7ac7	refactor(yuanbao): consolidate media resolution into dedicated pipeline middlewares	2026-06-09 03:17:00 -07:00
JP Lew	cb4cc08b0a	fix(codex): record app-server token usage in session accounting	2026-06-09 02:46:04 -07:00
kshitij	85852b71d8	fix(nemo-relay): preserve downstream errors in adaptive execution (#42691 ) Based on #42658 by @mnajafian-nv. Preserves the real downstream provider/tool exception when NeMo Relay's managed adaptive execution wraps a failing callback as an internal runtime error. Without this, the original exception (and its retry-classification signal, e.g. status_code) is lost behind Relay's wrapper. Salvage changes on top of the original PR: - Tolerant Relay-wrapper match: _is_relay_wrapped_callback_error now uses str.startswith on the "internal error: <cls>: <msg>" prefix instead of exact equality, so a future Relay version appending a traceback/suffix doesn't silently defeat the unwrap. On a total format change it returns False and falls back to the pre-fix behavior (surfacing Relay's error) rather than masking it. - Deduplicated the LLM and tool execute paths into a shared _run_managed_with_downstream_preservation helper, removing ~20 lines of copy-pasted nonlocal/try-except scaffolding that could drift out of sync. - Added a real-middleware regression guard (test_nemo_relay_downstream_unwrap_matches_real_middleware_wrapper_shape) that drives hermes_cli.middleware._run_execution_chain and asserts the plugin's _original_downstream_error unwraps the actual private _DownstreamExecutionError wrapper. The original synthetic tests modeled the wrapper with a local class, so a rename or shape change in core middleware would not have been caught; this test fails loudly if that contract drifts. Co-authored-by: mnajafian-nv <mnajafian@nvidia.com>	2026-06-09 02:31:10 -07:00
Teknium	8d99b5bc4f	fix(gateway): cap terminal code-block preview in non-verbose mode (#42729) The markdown code-block change rendered args['command'] in full in both verbose AND non-verbose (all/new) modes, so a long or multi-line terminal command bypassed the tool_preview_length cap (default 40) and rendered as a huge block. Non-verbose now collapses to a single line capped at the preview length while keeping the fence; verbose keeps the full command.	2026-06-09 02:28:47 -07:00
kshitij	a38cc69bcc	fix(terminal): complete sane PATH entries on POSIX (salvage of #35614 ) (#42653 ) * fix(terminal): complete sane PATH entries on POSIX Fixes macOS gateway/launchd terminal sessions whose PATH already includes /usr/bin while omitting Apple Silicon Homebrew paths. LocalEnvironment._make_run_env() now appends each missing _SANE_PATH entry individually on POSIX, preserving caller precedence and avoiding duplicate sane entries. Root cause: the previous logic used /usr/bin as the sentinel for sane PATH injection. macOS launchd commonly provides /usr/bin while leaving out /opt/homebrew/bin and /opt/homebrew/sbin, so Homebrew-installed CLIs stayed unavailable in terminal tool calls. Salvaged from #35614 by @y0shua1ee. Fixes #35613. Co-authored-by: y0shua1ee <104712437+y0shua1ee@users.noreply.github.com> * test(terminal): harden sane PATH completion against dup/empty entries Follow-up to the #35613 fix. Strengthens _append_missing_sane_path_entries: - De-duplicate the caller-supplied PATH (first occurrence wins) so a PATH that already contains duplicate entries is collapsed rather than carried through. Previously only newly-appended sane entries were guarded against duplication; pre-existing caller duplicates were preserved verbatim. - Drop empty PATH entries (leading/trailing/double ':'), which POSIX shells interpret as the current working directory — a mild foot-gun in a default terminal environment. Behaviour for well-formed PATHs (no duplicates, no empty entries) is byte-identical to before; only malformed/duplicated inputs change. Adds regression tests for: the literal macOS launchd PATH (/usr/bin:/bin:/usr/sbin:/sbin), caller-duplicate collapsing with order preservation, and empty-entry stripping. 
* docs(terminal): clarify PATH normalisation semantics; drop dead set add Addresses review findings on the sane-PATH completion follow-up: - Sharpen the _append_missing_sane_path_entries docstring to state explicitly that on POSIX the caller PATH is rewritten (empty entries stripped, duplicates collapsed) rather than merely appended to, and that well-formed PATHs remain byte-identical bar the appended sane entries. This makes the intentional semantic change visible rather than buried under "hardening". - Document why _path_env_key is a deliberate second Windows guard distinct from the helper's early return (key-casing selection vs standalone safety), so neither is mistaken for redundant and removed. - Drop the dead `seen.add(entry)` in the sane-entry loop: _SANE_PATH is a static duplicate-free constant, so the membership check against the caller entries is sufficient and `seen` is never read afterwards. No behaviour change: verified byte-identical output across the launchd, minimal, empty, duplicate, empty-entry and already-full cases, and re-confirmed gh/brew resolve through the real LocalEnvironment.execute() path under a launchd-style PATH. 133 targeted tests pass. Intentionally NOT consolidating with tools/browser_tool._merge_browser_path: it prepends (vs append), filters on os.path.isdir, uses os.pathsep, and draws from a dynamic candidate set — a shared helper is a separate refactor, out of scope for this bugfix. --------- Co-authored-by: y0shua1ee <104712437+y0shua1ee@users.noreply.github.com>	2026-06-09 02:21:12 -07:00
kshitij	76f89d66de	fix(test): track TERMINAL_CONFIG_ENV_MAP after env-sync consolidation (#42695) `test_terminal_config_env_sync.py::_save_config_env_sync_keys()` AST-scanned `hermes_cli/config.py:set_config_value` for a `_config_to_env_sync = {...}` literal. The terminal-config env bridging was consolidated onto the canonical `TERMINAL_CONFIG_ENV_MAP` (now read via `terminal_config_env_var_for_key()`), so that literal no longer exists and the scanner raised: AssertionError: Could not find `_config_to_env_sync = {...}` literal in source failing 8 of 9 tests on main for every PR. Read the live `TERMINAL_CONFIG_ENV_MAP` instead — the actual source of truth `set_config_value` bridges through — mirroring its `terminal.cwd` exclusion. Refresh the stale module docstring and the now-incorrect error-message hints that still referenced `_config_to_env_sync`. Verified: the suite goes green, and a mutation (dropping `docker_volumes` from `TERMINAL_CONFIG_ENV_MAP`) still trips the pinned regression test, so the drift guard retains its teeth.	2026-06-09 02:11:46 -07:00
helix4u	f8adefdebf	fix(tui): apply terminal backend config before launch	2026-06-09 00:31:27 -07:00
teknium1	dbbd1d4d05	feat(desktop+gateway): remote-gateway file attachments via file.attach @file: attachments now work when the desktop is connected to a remote gateway. Previously a referenced file resolved to a client-disk path the gateway couldn't see, so context_references rejected it with "path is outside the allowed workspace" and the agent never saw the file. Adds a file.attach RPC (sibling to the existing image.attach_bytes / pdf.attach byte-upload pipeline): the desktop uploads the file bytes, the gateway stages them into <workspace>/.hermes/desktop-attachments/ and returns a workspace-relative @file: ref that resolves cleanly. Local mode passes the path directly; a gateway-visible file outside the workspace is copied in; an in-workspace file is referenced as-is with no copy. Consolidates the file-sync design from #38615 (LeonSGP43) and the host-file-staging idea from #33455 (Carry00), rebased onto the image/PDF remote-media helpers already on main. Co-authored-by: LeonSGP43 <cine.dreamer.one@gmail.com>	2026-06-09 00:03:49 -07:00
Teknium	50ad191a8b	test(hermes_cli): harden concurrent-gate fixture against partial-import race (#42626) The autouse _suppress_concurrent_hermes_gate fixture did monkeypatch.setattr(main, '_detect_concurrent_hermes_instances', ...) with no raising=False. Its try/except guards the import but not the setattr, so under pytest's per-test spawn isolation a transiently partial hermes_cli.main module (one a concurrent worker is mid-importing) made setattr raise AttributeError and errored unrelated tests in the slice. Add raising=False so a transiently-absent attribute is a no-op default rather than a hard error. The attribute always exists once main.py finishes importing; the real-function opt-out (@pytest.mark.real_concurrent_gate) is unaffected.	2026-06-08 22:54:25 -07:00
teknium1	520b59db16	fix(tui): use canonical get_fallback_chain for parity + map author Follow-up to the salvaged fallback-chain fix: - Replace the hand-rolled fallback loader with the shared hermes_cli.fallback_config.get_fallback_chain() helper so the TUI path matches HermesCLI and gateway/run.py exactly: fallback_providers stays first and keeps order, with distinct legacy fallback_model entries merged in after (deduped). Previously the TUI loader picked one key OR the other, diverging from CLI/gateway when both were set. - Update the test to assert the merged canonical semantics. - Add psionic73 to scripts/release.py AUTHOR_MAP (CI gate).	2026-06-08 22:53:42 -07:00
psionic73	4b073d0906	fix(tui): preserve fallback provider chain	2026-06-08 22:53:42 -07:00
underthestars-zhy	dbf2470d46	feat(photon): Add voice message support to Photon adapter Extend the sidecar and Python adapter to handle `voice` content alongside `attachment`. Voice notes are inlined as base64 (same size-cap logic), surfaced as `MessageType.VOICE`, and include an optional `duration` field in fallback markers when bytes are unavailable.	2026-06-08 22:53:01 -07:00
underthestars-zhy	0337658904	fix(photon): migrate user API calls to Spectrum backend Switch `list_users`, `find_user_by_phone`, `create_user`, `register_user_if_absent`, and `refresh_user_numbers` from the Dashboard API (Bearer token) to the Spectrum API (Basic auth with project credentials). Update response unwrapping to handle the nested `data.users` envelope returned by Spectrum, add `_spectrum_host()` resolver, `_basic()` header helper, and structured error helpers. Update tests, docs, and plugin.yaml accordingly.	2026-06-08 22:53:01 -07:00
underthestars-zhy	b58ff93459	feat(photon): persist and display user phone numbers in status Store operator and assigned iMessage numbers in `auth.json` after setup, and surface them in `hermes photon status`. When numbers are missing, status auto-refreshes from the dashboard without provisioning new lines.	2026-06-08 22:53:01 -07:00
Teknium	9351cbafab	fix(gateway): auto-deliver image_generate output as native media (#42616) image_generate returns its artifact as JSON ({"image": "/abs/path.png"}) with no MEDIA: tag, so the gateway auto-append path (which only recognized text_to_speech MEDIA: tags) never delivered it — image delivery silently depended on the model restating the path in its reply. Add image_generate to the producer allowlist and extract the local path from its JSON result (host_image > image > agent_visible_image), reusing the existing extension-anchored matcher and history-dedupe so remote URLs, unknown extensions, failures, and already-sent paths are rejected. Closes the remaining unfixed path from #19105.	2026-06-08 22:51:03 -07:00
teknium	18ead88273	test: update docker preflight assertion for stdin=DEVNULL kwarg The blanket stdin=subprocess.DEVNULL pass added the kwarg to the docker 'version' preflight call; the test pinned the exact kwargs dict. Update the expected dict to match.	2026-06-08 22:46:57 -07:00
teknium	dba6380ca6	test: guard OAuth setup-token stays interactive + marker exemption Regression tests for the salvage follow-up: the interactive 'claude setup-token' login must keep inherited stdin, and the guard's inline 'noqa: subprocess-stdin' marker must exempt a call.	2026-06-08 22:46:57 -07:00
m4dni5	8bb60ff039	test: add pytest guard for subprocess stdin= in TUI-context code Wraps scripts/check_subprocess_stdin.py as a pytest so CI catches regressions when new subprocess calls are added without stdin=.	2026-06-08 22:46:57 -07:00
Teknium	3705625b74	feat(gateway): render terminal commands as bare fenced code blocks in chat (#42576) Terminal tool progress on markdown-capable gateways (Telegram, Slack, Discord, WhatsApp, Matrix, Weixin, Feishu) renders the full command in a fenced code block again, in all/new AND verbose modes — gated on the adapter's supports_code_blocks capability. Plain-text platforms keep the short truncated preview. No language tag is emitted: Slack mrkdwn renders a '```bash' fence with 'bash' as a literal first code line, so a bare '```' fence is used, which renders correctly on every platform that supports blocks. This restores the #41215 feature (removed in #41950 due to the command showing in group chats) as the default. For a personal assistant the command display is desired; the group-chat concern is a preference, not a vulnerability.	2026-06-08 21:19:05 -07:00
underthestars-zhy	3b983e7791	fix(photon): add home channel env seed and simplify space resolution	2026-06-08 21:03:58 -07:00
underthestars-zhy	0646656884	fix(photon): support E.164 and DM GUID targets for home channel Allow PHOTON_HOME_CHANNEL to accept a bare E.164 phone number or a `any;-;+1...` DM chat GUID in addition to a Spectrum space id. Inbound DM spaces are cached so replies resolve without a second SDK lookup, and `photon` is added to _PHONE_PLATFORMS so send_message treats E.164 strings as explicit targets rather than falling through to channel-name resolution.	2026-06-08 21:03:58 -07:00
underthestars-zhy	92179352fb	feat(photon): auto-configure allowlist and cron channel on setup During `hermes photon setup`, allowlist the operator's number and set their DM as the cron home channel when those env vars are unset. Without this, the gateway denies the operator's own messages and cron has no default delivery target. Re-runs never overwrite hand-tuned values. Also teaches the sidecar's `resolveSpace` to accept a bare E.164 number as a space identifier, resolving it to the user's DM space so `PHOTON_HOME_CHANNEL` can be set to a phone number instead of an opaque space id.	2026-06-08 21:03:58 -07:00
underthestars-zhy	84e4b4b9a5	fix(photon): use per-user assigned line for agent iMessage number On shared-number plans, `/lines` has no dedicated entry, so the `assignedPhoneNumber` field on the user object is the source of truth for which number to text the agent. Fall back to the line inventory only when no per-user assignment exists.	2026-06-08 21:03:58 -07:00
underthestars-zhy	314af28e86	feat(photon): download and inline inbound attachments	2026-06-08 21:03:58 -07:00
underthestars-zhy	4e4d27875f	feat(photon): gRPC-native iMessage channel (no webhook) Make Photon iMessage a first-class persistent-connection channel like Discord/Slack, using the spectrum-ts gRPC stream for both directions. - Inbound: the sidecar forwards the SDK's app.messages gRPC stream to the adapter over a loopback GET /inbound (NDJSON) instead of webhooks. Drops the aiohttp webhook server, HMAC signature verification, public URL, and PHOTON_WEBHOOK_* config; adapter reconnects with backoff. - Management plane: device login uses client_id=photon-cli against the single dashboard host (Bearer), matching the official photon-hq/cli; find-or-create "Hermes Agent" project, enable Spectrum, rotate secret, register user (with phone dedup), surface the assigned iMessage line. - SDK projectId is the project's spectrumProjectId, not the dashboard id; runtime creds persist to ~/.hermes/.env like every other channel. - CLI: 6-step setup, webhook subcommands removed. - Tests/docs updated for the gRPC flow; sidecar pins spectrum-ts ^1.17.1. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 21:03:58 -07:00
Juraj Bednar	0c2e81df00	feat(simplex): groups, native attachments, text batching, auto-accept Salvage of PR #27978 cherry-picked onto current main, resolving conflicts with main's intervening SimpleX plugin fixes (resp-envelope normalization, health-monitor reconnect-churn fix, bare-form DM addressing). What's new: - Group support via SIMPLEX_GROUP_ALLOWED (comma-separated IDs or '*'); inbound items surface chat_id=group:<id> + chat_type=group. Disabled by default so a bot in a group doesn't process every member's traffic. - Inbound files/voice via rcvFileDescrReady (immediate /freceive) deferred through _pending_file_transfers, replayed on rcvFileComplete. Voice notes -> MessageType.VOICE. - Native outbound media: send_image (PNG/JPEG + inline thumbnail), send_voice (msgContent.type=voice), send_video, send_document. All addressed by numeric ID via /_send ... json [...]. - MEDIA:<path> tags in agent replies stripped and dispatched as voice/document. - Text-burst batching (HERMES_SIMPLEX_TEXT_BATCH_DELAY, default 0.8s). - Auto-accept contact requests (SIMPLEX_AUTO_ACCEPT, default true). - Group send path uses structured /_send #<id> json form (the bracket #[<id>] form is parsed as display-name lookup and silently drops). plugin.yaml bumped to 1.1.0; docs updated. All inside plugins/platforms/simplex/ - no core edits. Co-authored-by: Juraj Bednar <juraj@bednar.io>	2026-06-08 21:03:45 -07:00
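
The PATH normalisation described in commit a38cc69bcc (strip empty entries, collapse caller duplicates with first occurrence winning, then append each missing sane entry) can be sketched roughly as below. This is an illustration only, not the project's code: `SANE_PATH` and `complete_posix_path` are hypothetical names, and the contents of the real `_SANE_PATH` constant are an assumption, since the commit message does not list them.

```python
# Hypothetical stand-in for the project's _SANE_PATH constant. The real
# contents are not given in the commit message; these entries are assumed.
SANE_PATH = (
    "/opt/homebrew/bin",
    "/opt/homebrew/sbin",
    "/usr/local/bin",
    "/usr/bin",
    "/bin",
    "/usr/sbin",
    "/sbin",
)


def complete_posix_path(path: str, sane: tuple = SANE_PATH) -> str:
    """Return `path` normalised and completed with any missing sane entries.

    Hypothetical helper: caller entries keep their order and precedence,
    empty entries are dropped, duplicates collapse to the first occurrence,
    and sane entries the caller lacks are appended at the end.
    """
    seen = set()
    entries = []
    for entry in path.split(":"):
        # An empty entry means "current directory" to POSIX shells; drop it.
        if not entry or entry in seen:
            continue
        seen.add(entry)
        entries.append(entry)
    for entry in sane:
        # `sane` is assumed duplicate-free, so checking against the caller's
        # entries is enough; nothing needs to be added to `seen` here.
        if entry not in seen:
            entries.append(entry)
    return ":".join(entries)


# A launchd-style PATH gains the Homebrew directories after its own entries.
print(complete_posix_path("/usr/bin:/bin:/usr/sbin:/sbin"))
```

Appending rather than prepending keeps whatever precedence the caller already set up, which is the behaviour the commit message describes for POSIX.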

5206 commits