hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-10 08:32:09 +00:00

Author	SHA1	Message	Date
Teknium	298bb93d39	feat(skills): show live per-source progress while browsing (#43398) do_browse waited on a frozen 'Fetching skills...' spinner while sources resolved, so a slow source looked like a hang. parallel_search_sources already exposes an on_source_done(sid, count) callback fired as each source completes; wire it into the status line so it ticks off sources live (official (12), + github (4), + clawhub (500)). 
The page is still rendered once, after the full set is merged and trust-sorted, so browse's official-first ordering and pagination contract are untouched.	2026-06-10 01:02:40 -07:00
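A minimal sketch of the callback wiring described in this commit. Only the `on_source_done(sid, count)` shape comes from the message; `make_progress_callback` and `update_status` are hypothetical names used for illustration, not the repository's code.

```python
import threading
from typing import Callable, List


def make_progress_callback(update_status: Callable[[str], None]) -> Callable[[str, int], None]:
    """Build an on_source_done(sid, count) callback that rewrites one status line."""
    finished: List[str] = []
    lock = threading.Lock()  # assumption: callbacks may arrive from worker threads

    def on_source_done(sid: str, count: int) -> None:
        with lock:
            finished.append(f"{sid} ({count})")
            # Re-render the whole line so each completed source is ticked off live.
            update_status("Fetching skills... " + ", + ".join(finished))

    return on_source_done
```

The page itself would still be rendered once after all sources are merged; the callback only touches the status line.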
Teknium	eee1da45f0	fix(skills): bound ClawHub catalog walk to requested page on cold start (#43395) Browse renders one page but the cold-cache fallback walked the entire 50k+ ClawHub catalog, then sliced off the first N: pure waste behind the 12s budget band-aid. _load_catalog_index now takes max_items: browse's empty-query path bounds the walk to its limit and stops early; the offline index builder still passes limit=0 (unbounded) and walks to exhaustion. A bounded walk is partial, so it is not written to the shared full-catalog cache (same poison-guard as the budget-truncated case).	2026-06-10 01:01:53 -07:00
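The bounded walk can be sketched roughly as follows. `walk_catalog` and `fetch_page` are hypothetical stand-ins (the real method is `_load_catalog_index`); the second return value models the "only cache a complete walk" guard.

```python
from typing import Callable, List, Optional, Tuple

Page = Tuple[List[str], Optional[str]]  # (items on this page, next cursor or None)


def walk_catalog(
    fetch_page: Callable[[Optional[str]], Page], max_items: int = 0
) -> Tuple[List[str], bool]:
    """Walk a cursor-paginated catalog; max_items=0 means unbounded.

    Returns (items, complete). Callers should write the shared cache only
    when complete is True, so a partial walk never poisons it.
    """
    items: List[str] = []
    cursor: Optional[str] = None
    while True:
        page, cursor = fetch_page(cursor)
        items.extend(page)
        if max_items and len(items) >= max_items:
            # Stopped early on purpose: treat as partial even if this was the last page.
            return items[:max_items], False
        if cursor is None:
            return items, True
```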
konsisumer	6a30cfca82	fix(gateway): stop typing before post-delivery callbacks (#37556)	2026-06-10 00:46:00 -07:00
tomekpanek	383d44bc9a	fix(web): rank explicit credentials above managed-gateway probe Backend selection ordered firecrawl (including the Nous-managed-tool-gateway probe) ahead of explicit-credential backends, so a user who had both a Nous OAuth token AND a TAVILY_API_KEY (or EXA/PARALLEL key) got firecrawl auto-selected; the request then failed at runtime because the free Nous tier does not include web search, and there is no fallback to the next available backend. Explicit user setup lost to a managed convenience. Reorder so direct-credential backends (tavily > exa > parallel > firecrawl-direct) are tried first, then the managed-gateway firecrawl probe, then free-tier fallbacks. Behaviour for users with only Nous OAuth (no explicit key) is unchanged: firecrawl-via-gateway is still selected. Behaviour change to flag: a user with BOTH a Nous OAuth token AND a TAVILY_API_KEY (or EXA/PARALLEL key) now gets the explicit backend instead of the managed gateway. This matches the principle of least surprise (a user does not set TAVILY_API_KEY without intent) and sidesteps the silent runtime failure of the gateway path on free tiers.	2026-06-10 00:34:38 -07:00
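The new ordering might look like the sketch below. Only `TAVILY_API_KEY` is named in the commit; the other environment variable names, the backend identifiers, and `select_backend` itself are assumptions for illustration, and the free-tier fallbacks are left out.

```python
from typing import Mapping, Optional

# Explicit-credential backends in priority order.
# Assumption: every key name except TAVILY_API_KEY is hypothetical.
_EXPLICIT_BACKENDS = (
    ("tavily", "TAVILY_API_KEY"),
    ("exa", "EXA_API_KEY"),
    ("parallel", "PARALLEL_API_KEY"),
    ("firecrawl-direct", "FIRECRAWL_API_KEY"),
)


def select_backend(env: Mapping[str, str], has_managed_gateway: bool) -> Optional[str]:
    """Prefer a backend the user configured explicitly over the managed gateway."""
    for name, key in _EXPLICIT_BACKENDS:
        if env.get(key):
            return name
    if has_managed_gateway:
        return "firecrawl-via-gateway"
    return None  # free-tier fallbacks would be tried here
```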
Teknium	243cada157	fix(model): cover typed gateway /model path + async-safe pricing lookups Follow-ups on top of #26016's expensive-model guard: - gateway/slash_commands.py: typed '/model <name>' now routes through the expensive-model confirmation gate (slash-confirm buttons / text fallback) instead of bypassing the guard the pickers enforce. Cancel leaves the session override and --global config untouched. - telegram/discord/web_server: run expensive_model_warning() via asyncio.to_thread — it can hit models.dev or a /models endpoint on a cache miss, which would otherwise block the event loop. - telegram: picker callback no longer toasts 'Model switched!' when the switch callback raised (both mm: and mc: paths). - tests: new tests/gateway/test_model_command_expensive_confirm.py pins the typed-path gate (prompt, confirm-once, cancel, cheap-model no-op).	2026-06-10 00:24:06 -07:00
Robin Fernandes	af978ecb17	fix(model): require confirmation for expensive model selections Rebased onto current main and re-ported across the restructured surfaces: model flows now thread confirm_provider/base_url/api_key through hermes_cli/model_setup_flows.py, the Discord picker lives in plugins/platforms/discord/adapter.py, and the web dashboard picker applies chat-mode switches via config.set so the expensive-model confirmation can ride the response. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 00:24:06 -07:00
xxxigm	aea0b7397b	test(discord): cover voice timeout under voice-off mode Assert the inactivity handler skips disconnect (and the channel spam) when the voice-mode getter reports "off", and still disconnects on genuine inactivity when the mode is active.	2026-06-09 23:24:26 -07:00
briandevans	105625d650	fix(skills): honour overall_timeout and bound ClawHub catalog walk parallel_search_sources accepted an overall_timeout but never honoured it. The ThreadPoolExecutor ran inside a `with ... as pool` block, whose __exit__ calls shutdown(wait=True); even after as_completed() raised TimeoutError on schedule, leaving the block blocked the caller until every worker finished. A single slow source (e.g. ClawHub) therefore stalled the entire browse for minutes. Manage the executor manually and shut it down with wait=False, cancel_futures=True in a finally, so the timeout actually returns and not-yet-started work is dropped. ClawHubSource._load_catalog_index walked up to 750 sequential pages with no wall-clock bound (each request under its own timeout=30, so nothing errored), and wrote the result to the index cache unconditionally — so an interrupted or slow walk poisoned the cache with a partial catalog. Add a CATALOG_WALK_BUDGET_SECONDS deadline that breaks the walk early, and only write the cache when the walk reaches a natural stop (cursor exhausted or page cap), never on a budget-truncated walk. Adds regression tests covering both bugs (timeout honoured + slow source flagged; budget abort does not poison cache) plus their happy-path invariants.	2026-06-09 23:22:54 -07:00
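The executor pitfall described here can be reproduced and avoided with a sketch like this one. `search_all` is a hypothetical stand-in for `parallel_search_sources`; `cancel_futures=True` requires Python 3.9 or newer.

```python
import concurrent.futures as cf
from typing import Callable, Dict, List, Tuple


def search_all(
    sources: Dict[str, Callable[[], list]], overall_timeout: float
) -> Tuple[Dict[str, list], List[str]]:
    """Run every source in parallel and return after overall_timeout at the latest."""
    # Managed manually: a `with` block would call shutdown(wait=True) on exit
    # and block until the slowest worker finished, defeating the timeout.
    pool = cf.ThreadPoolExecutor(max_workers=max(1, len(sources)))
    futures = {pool.submit(fn): sid for sid, fn in sources.items()}
    results: Dict[str, list] = {}
    timed_out: List[str] = []
    try:
        for fut in cf.as_completed(futures, timeout=overall_timeout):
            sid = futures[fut]
            try:
                results[sid] = fut.result()
            except Exception:
                results[sid] = []  # a failing source should not sink the rest
    except cf.TimeoutError:
        timed_out = [sid for fut, sid in futures.items() if not fut.done()]
    finally:
        # Return immediately; drop queued work that has not started yet.
        pool.shutdown(wait=False, cancel_futures=True)
    return results, timed_out
```

A worker that is already running cannot be interrupted this way; it finishes in the background while the caller moves on.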
teknium	2ce3ae3d16	fix(error-classifier): don't misclassify unsupported-param 400s as context overflow A GPT-5 model rejecting max_tokens returns a 400 whose message contains the literal substring 'max_tokens' — one of the _CONTEXT_OVERFLOW_PATTERNS. The 400 path in _classify_400 checked overflow patterns before any request-validation check (which only existed on the 5xx path), so the parameter error was routed into the compression loop, re-sent with the same bad param, and ended in 'Cannot compress further' on a tiny context. Hoist a request-validation guard (unsupported/unknown parameter) above the context-overflow check in _classify_400. Deliberately excludes the generic invalid_request_error code, which OpenAI also stamps on real overflow 400s, so genuine overflows still compress. Pairs with the max_completion_tokens param fix that stops the bad request at the source. Also adds AUTHOR_MAP entry for the salvaged PR #13902 commit.	2026-06-09 23:22:10 -07:00
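The check-order fix could be sketched as below. Apart from the `max_tokens` substring, the pattern lists and the returned labels are assumptions, and `classify_400` only loosely mirrors the real `_classify_400`.

```python
# Assumption: illustrative pattern lists; only 'max_tokens' is taken from the commit.
_CONTEXT_OVERFLOW_PATTERNS = ("context length", "max_tokens", "too many tokens")
_REQUEST_VALIDATION_PATTERNS = (
    "unsupported parameter",
    "unsupported_parameter",
    "unknown parameter",
)


def classify_400(message: str) -> str:
    """Classify a 400 response body, checking parameter errors first."""
    msg = message.lower()
    # Validation guard comes first: an "unsupported parameter: max_tokens"
    # error also contains an overflow substring and must not reach compression.
    if any(p in msg for p in _REQUEST_VALIDATION_PATTERNS):
        return "request_validation"
    if any(p in msg for p in _CONTEXT_OVERFLOW_PATTERNS):
        return "context_overflow"
    return "bad_request"
```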
Xiangji	19c07c4037	fix(params): send max_completion_tokens for newer OpenAI families on custom endpoints Third-party OpenAI-compatible endpoints (self-hosted gateways, OpenRouter, Azure proxies) fronting gpt-4o / gpt-4.1 / gpt-5+ / o1-o4 models silently received max_tokens and 400'd with unsupported_parameter, because the three kwarg-selection sites only checked base_url_hostname(...) == "api.openai.com" and fell through to max_tokens on every other host. The constraint is enforced server-side by the model family, not by the URL, so name-based detection is required as a fallback. Changes: - utils.py: new shared helper model_forces_max_completion_tokens(model) that prefix-matches gpt-4o, gpt-4.1, gpt-5, o1, o3, o4 families on normalized (lowercased, vendor-prefix-stripped) names. - run_agent.py: _max_tokens_param ORs the helper into the URL check. - agent/auxiliary_client.py: - auxiliary_max_tokens_param gains an optional keyword-only model arg. - _build_call_kwargs inline branch applies the same check for both provider == "custom" and non-custom paths. Tests: - tests/test_model_forces_max_completion_tokens.py: 31 new cases covering positive families, negatives (classic gpt-4, claude, llama, mistral, qwen, deepseek), vendor prefixes, case-insensitivity, whitespace, None/empty, and substring-not-prefix guards. - tests/run_agent/test_run_agent.py::TestMaxTokensParam: 5 new model-based cases (custom + gpt-5.4, openrouter + gpt-4o-mini, custom + o1-preview, classic gpt-4-turbo keeps max_tokens, llama3 keeps max_tokens). - tests/agent/test_auxiliary_client.py::TestAuxiliaryMaxTokensParam: new class, 7 tests covering the URL x model matrix.	2026-06-09 23:22:10 -07:00
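A rough sketch of the name-based helper, built only from the commit's description (prefix match on lowercased, vendor-prefix-stripped names). The stripping rule used here, keeping the part after the last `/`, is an assumption; the real helper in utils.py may normalize differently.

```python
from typing import Optional

# Families named in the commit message.
_MAX_COMPLETION_TOKENS_PREFIXES = ("gpt-4o", "gpt-4.1", "gpt-5", "o1", "o3", "o4")


def model_forces_max_completion_tokens(model: Optional[str]) -> bool:
    """True when the model family rejects max_tokens regardless of host."""
    if not model:
        return False
    # Assumption: a vendor prefix looks like "vendor/model".
    name = model.strip().lower().rsplit("/", 1)[-1]
    # Prefix match, not substring: "my-gpt-5-clone" must not match.
    return name.startswith(_MAX_COMPLETION_TOKENS_PREFIXES)
```

A caller would OR this into the existing hostname check, so api.openai.com keeps working as before and custom endpoints gain the name-based fallback.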
Ondrej Drapalik	1c055a4c58	fix(xai): accept Grok Build code during loopback wait + tiny screenshot guard xAI's consent page renders the authorization code in-page instead of redirecting to the loopback callback, so the listener just hangs and the manual-paste flow demands a callback URL that never contains the token. - auth.py: poll stdin non-blockingly while waiting for the xAI loopback callback; accept a pasted bare Grok Build code and substitute the locally generated state (PKCE code_verifier still binds the exchange). No need to wait for timeout or re-run with --manual-paste. - computer_use: parse PNG/JPEG dimensions from base64 and fall back to the text/AX/SOM payload when the screenshot is below the provider minimum (8x8), which xAI rejects with HTTP 400. - model_setup_flows.py: xAI credential reuse prompt uses the standard radio picker via a shared _prompt_auth_credentials_choice helper. - main.py: thread a title through _prompt_provider_choice; re-home the helper import (flows live in model_setup_flows.py post-decomposition). Salvaged from #36781 onto current main (contributor's main.py edits re-homed to model_setup_flows.py, where the flows were extracted since the PR opened).	2026-06-09 23:21:24 -07:00
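For the screenshot guard, reading PNG dimensions from a base64 payload might look like the sketch below. It covers PNG only (the commit also handles JPEG), assumes unwrapped base64 without embedded newlines, and uses hypothetical function names.

```python
import base64
import binascii
import struct
from typing import Optional, Tuple

_PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
_MIN_SIDE = 8  # provider minimum quoted in the commit (8x8)


def png_dimensions(b64_data: str) -> Optional[Tuple[int, int]]:
    """Return (width, height) from a base64 PNG, or None if it cannot be parsed."""
    try:
        # 44 base64 chars decode to 33 bytes, enough for signature + IHDR.
        head = base64.b64decode(b64_data[:44])
    except (binascii.Error, ValueError):
        return None
    if len(head) < 24 or head[:8] != _PNG_SIGNATURE or head[12:16] != b"IHDR":
        return None
    # IHDR data starts at byte 16: big-endian width, then height.
    width, height = struct.unpack(">II", head[16:24])
    return width, height


def is_below_minimum(b64_data: str) -> bool:
    """True only when the image parses and one side is under the minimum."""
    dims = png_dimensions(b64_data)
    return dims is not None and (dims[0] < _MIN_SIDE or dims[1] < _MIN_SIDE)
```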
Teknium	095f526b11	refactor(memory,skills): replace tri-state write_mode with boolean write_approval (default off) (#43354) The shipped tri-state write_mode (on|off|approve) conflated two concepts, whether writes are enabled and whether they're gated, so 'on' (writes flow freely, gate inactive) read like 'gating is on'. Replace it with a single clear boolean gate that defaults off. memory.write_approval / skills.write_approval: false (default): write freely; the approval gate is off (pre-gate behaviour). true: require approval; memory foreground prompts inline, memory background-review + all skill writes stage for review. The old 'off = block all writes' mode is dropped; memory_enabled: false already disables memory entirely, so a third 'block' state was redundant. - tools/write_approval.py: get_write_mode/MODE_* → write_approval_enabled() bool; evaluate_gate() loses the config-driven 'blocked' path (blocked now only comes from an interactive user denial). - tools/memory_tool.py, tools/skill_manager_tool.py: comment + behaviour follow. - hermes_cli/config.py: memory/skills write_mode → write_approval (False); _config_version 28→29 with a 28→29 migration that renames any persisted write_mode (approve→true, on/off/unset→false) and drops the old key. - slash commands: '/memory|/skills mode <on|off|approve>' → 'approval <on|off>' ('mode' kept as a back-compat alias); set_mode_fn callback now takes a bool. - write_approval_commands.py, cli_commands_mixin.py, gateway/slash_commands.py, commands.py: handlers + registry args/subcommands updated. - docs + tests rewritten for the boolean model; added migration tests.	2026-06-09 23:21:14 -07:00
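The 28→29 migration rule (approve→true, on/off/unset→false, old key dropped) can be illustrated with a small sketch. `migrate_28_to_29` is a hypothetical name and the config is modelled as a plain dict; the real migration in hermes_cli/config.py may be structured differently.

```python
import copy
from typing import Any, Dict


def migrate_28_to_29(config: Dict[str, Any]) -> Dict[str, Any]:
    """Rename memory/skills write_mode to a boolean write_approval."""
    cfg = copy.deepcopy(config)  # never mutate the caller's config
    for section in ("memory", "skills"):
        sec = cfg.get(section)
        if not isinstance(sec, dict):
            continue
        old_mode = sec.pop("write_mode", None)  # drops the old key
        # Only 'approve' maps to True; 'on', 'off', and unset all map to False.
        sec.setdefault("write_approval", old_mode == "approve")
    cfg["_config_version"] = 29
    return cfg
```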
synapsesx	9ca9697342	fix(gateway): return tuple from voice transcription on placeholder caption (#42090) ## What does this PR do? The voice-during-active-run feature (#41984) changed `_enrich_message_with_transcription` so that it returns a `(enriched_text, successful_transcripts)` tuple instead of a bare string, which lets callers echo the raw transcript back to the user. The signature and every other return path were updated to match, but one branch was missed: when a successfully transcribed clip arrives with the Discord "empty content" placeholder as its caption, the method still returned the prefix string on its own. All four call sites unpack the result with `text, transcripts = await self._enrich_message_with_transcription(...)`, so that path raised `ValueError: too many values to unpack (expected 2)` and the inbound voice message was dropped instead of reaching the agent. This is a real user-facing path rather than a corner case: a Discord voice note sent without a caption is delivered as exactly that placeholder, so a captionless voice message that transcribed correctly would crash the handler precisely when transcription had worked. The fix returns the proper tuple from that branch so the placeholder is still stripped while the transcripts continue to flow back to the caller for the echo. ## Related Issue N/A ## Type of Change - [x] 🐛 Bug fix (non-breaking change that fixes an issue) ## Changes Made - `gateway/run.py`: in `_enrich_message_with_transcription`, return `(prefix, successful_transcripts)` instead of a bare `prefix` from the empty-content-placeholder branch, so the contract matches the signature and the other return paths. - `tests/gateway/test_stt_config.py`: add `test_enrich_message_with_transcription_returns_tuple_for_empty_content_placeholder`, which drives a successful transcription with the placeholder caption and asserts the placeholder is stripped while the transcript is still returned. ## How to Test 1. Check out `main` and run the new test; it fails with `ValueError: too many values to unpack (expected 2)`, reproducing the crash a captionless Discord voice note would trigger. 2. Apply this change and re-run `pytest tests/gateway/test_stt_config.py -q`; all tests pass. 3. `ruff check gateway/run.py tests/gateway/test_stt_config.py` and `python scripts/check-windows-footguns.py gateway/run.py tests/gateway/test_stt_config.py` both pass. ## Checklist - [x] I've tested on my platform: macOS 15 (Darwin 25.5)	2026-06-09 23:16:23 -07:00
Ben Barclay	63a421d4c0	fix(dashboard): _require_token endpoints all 401 behind the OAuth gate (#42578 ) * fix(dashboard): let _require_token endpoints work behind the OAuth gate In gated/OAuth mode (non-loopback bind without --insecure) the dashboard authenticates the SPA via a session cookie and deliberately does NOT inject the legacy ephemeral _SESSION_TOKEN into index.html. gated_auth_middleware verifies the cookie and attaches request.state.session before any non-public /api/ route runs; the legacy auth_middleware short-circuits in this mode too. But several handlers call _require_token() directly, which only validated the (absent) _SESSION_TOKEN header. So every cookie-authenticated request to those endpoints 401'd — making plugin install/enable/disable, /api/dashboard/plugins/hub, and the other _require_token routes permanently unreachable behind the gate. In the UI this surfaced as a 401: {"detail":"Unauthorized"} popup on plugin install for any publicly-bound (e.g. Fly-hosted NAS) dashboard. Fix: _require_token now defers to the active gate. When auth_required is True it accepts the request iff the gate attached a verified session (and 401s otherwise); loopback/--insecure behavior is unchanged (still validates the session token). Adds two regression tests driving the full in-process stub OAuth round trip: the install endpoint must NOT 401 a logged-in request, and must still 401 with no cookie. Verified the accept-test fails on the pre-fix code. * test(dashboard): cover the whole _require_token route class under the gate The install popup was one symptom of a class-wide bug: all 14 endpoints that call _require_token directly (API-key reveal, provider validation, the OAuth-provider connect/disconnect flow, and plugin enable/disable/update/ delete/visibility/providers) 401'd cookie-authenticated requests in gated mode. 
Add a parametrized test hitting a representative spread (plugins/hub, env/reveal, providers/validate, an oauth provider route, agent-plugin enable) asserting a logged-in caller is never 401'd — proving the fix covers the class, not just agent-plugins/install.	2026-06-09 22:57:49 -07:00
Ben Barclay	e4a1b35a39	fix(config): preserve original .env file mode instead of unconditionally tightening to 0600 (#33699 ) `save_env_value()` captures the original .env file mode (e.g. 0640 for Docker volume mounts) and restores it via `os.chmod` — but then unconditionally calls `_secure_file(env_path)` on the next line, which re-tightens the mode to 0600 and defeats the entire preservation logic. The intent (preserve when `original_mode` is captured, secure otherwise) was already in the code but got short-circuited. Move `_secure_file()` into the `else` branch so it only runs when no original mode was captured — fresh `.env` files written for the first time still get the 0600 hardening treatment, but operator-set modes survive subsequent writes. Salvages #31518 by @blut-agent (config.py portion only). Their PR also bundled unrelated lowercase-lookup changes in `hermes_cli/commands.py`; this salvage takes only the focused config fix. The commands.py changes are reasonable on their own merits but belong in a separate PR. Co-authored-by: blut-agent <278569635+blut-agent@users.noreply.github.com>	2026-06-10 15:42:16 +10:00
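The intended control flow, sketched under POSIX assumptions. `save_env_text` is a hypothetical simplification of `save_env_value()`, and the inline `os.chmod(path, 0o600)` stands in for `_secure_file()`.

```python
import os
import stat


def save_env_text(path: str, text: str) -> None:
    """Write an env file, keeping an operator-set mode if the file already existed."""
    original_mode = None
    if os.path.exists(path):
        original_mode = stat.S_IMODE(os.stat(path).st_mode)

    with open(path, "w", encoding="utf-8") as fh:
        fh.write(text)

    if original_mode is not None:
        # Existing file: restore whatever the operator chose (e.g. 0640).
        os.chmod(path, original_mode)
    else:
        # Fresh file: harden it. This must live in the else branch, otherwise
        # it re-tightens the mode and undoes the restore above.
        os.chmod(path, 0o600)
```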
kshitij	f1b8519670	Merge pull request #43322 from kshitijk4poor/fix/langfuse-redact-base64-data-uri fix(langfuse): redact base64 data URIs instead of truncating into invalid base64	2026-06-09 22:41:41 -07:00
mnajafian-nv	f8fd30942c	fix(cli): prevent duplicate one-shot finalize on interrupted cleanup (#43320) Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>	2026-06-09 22:41:04 -07:00
LeonSGP	702f4df194	Repair cron ownership on container restart (#41976)	2026-06-10 15:32:34 +10:00
kshitijk4poor	9caa12f4ec	fix(skills): resolve skill_view by frontmatter name when dir name differs skills_list() surfaces each skill's frontmatter `name:`, but skill_view() only matched on the on-disk directory name (Strategy 2). When a skill's directory is a shorter category/alias that differs from its frontmatter name, skill_view(name) failed to find it. Extend the recursive Strategy-2 walk to also match frontmatter `name:`, guarded by a try/except so an unreadable/malformed SKILL.md can't break discovery. Adds a regression test that creates a skill whose directory name differs from its frontmatter name and asserts skill_view resolves it (fails on current main, passes with this change). Salvaged the skill_view fix from #39682 onto current main as a standalone, single-concern change with the test the original PR lacked. Co-authored-by: foras910521-lab <foras910521-lab@users.noreply.github.com>	2026-06-10 10:51:45 +05:30
kshitijk4poor	4642762289	fix(langfuse): redact base64 data URIs instead of truncating into invalid base64 The Langfuse SDK treats `data:;base64,...` strings as media and tries to decode them. `_truncate_text` was slicing those strings mid-payload, producing invalid base64 and noisy "Error parsing base64 data URI" logs. Observability only needs the metadata, not raw image/audio bytes, so redact the whole data URI (type, media_type, length) before it reaches the SDK. Salvaged the Langfuse fix from #39682 onto current main as a standalone, single-concern change (the dashboard `dist/*` and plugin-discovery parts of that PR already landed separately on main). Co-authored-by: foras910521-lab <foras910521-lab@users.noreply.github.com>	2026-06-10 10:49:36 +05:30
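Redacting the whole data URI before truncation could be sketched like this. The regex, the placeholder format, and the name `redact_data_uris` are assumptions; the commit only states that type, media type, and length are kept.

```python
import re

# Matches data URIs with a base64 payload; the media type may be empty ("data:;base64,...").
_DATA_URI_RE = re.compile(
    r"data:(?P<media>[\w.+-]+/[\w.+-]+)?(?:;[\w.+-]+=[\w.+-]+)*;base64,"
    r"(?P<payload>[A-Za-z0-9+/=]+)"
)


def redact_data_uris(text: str) -> str:
    """Replace each base64 data URI with a short metadata placeholder."""

    def _placeholder(match: "re.Match[str]") -> str:
        media = match.group("media") or "unknown"
        return f"[data URI redacted: media_type={media}, length={len(match.group('payload'))}]"

    return _DATA_URI_RE.sub(_placeholder, text)
```

Running this before any length-based truncation means the tracing SDK never sees a half-cut base64 string.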
mnajafian-nv	d03cdd63eb	fix(cli): run one-shot query cleanup before lease release (#43036) * fix(cli): run one-shot query cleanup before lease release Signed-off-by: mnajafian-nv <mnajafian@nvidia.com> * test(cli): cover quiet one-shot cleanup finalization Signed-off-by: mnajafian-nv <mnajafian@nvidia.com> Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>	2026-06-09 21:52:13 -07:00
Teknium	96af61b6ef	feat(memory,skills): approve/deny gate for memory + skill writes (#38199) Adds memory.write_mode and skills.write_mode (on|off|approve), applied to both foreground turns and the background self-improvement review fork, the source of the unprompted 'wrong assumption' saves users reported. - on (default): write freely, unchanged behaviour - off: never write; the tool returns a clean disabled result - approve: don't commit. Memory foreground writes prompt inline (small, reviewable in a chat bubble); background memory writes and ALL skill writes stage to a pending store instead (a SKILL.md is too large to review inline, and a daemon thread can't block on a prompt) Review staged writes from CLI or any messaging platform: /memory pending|approve|reject|mode /skills pending|approve|reject|diff|mode Skill review respects the size asymmetry: inline you see a one-line gist; the full unified diff stays out-of-band (/skills diff, dashboard, or the staged JSON file). New: tools/write_approval.py (gate + pending store), hermes_cli/write_approval_commands.py (shared CLI+gateway handlers). Gates wired at the single entry points memory_tool() and skill_manage(), using the existing write-origin ContextVar to distinguish foreground from background_review.	2026-06-09 21:51:43 -07:00
Teknium	f082b4ec5c	fix(ci): make parallel runner's exit-4 retry robust for newly-added test files (#42994 ) The per-file test runner re-runs a file once when pytest exits 4 ("file or directory not found") while the file exists on disk — a transient seen on loaded shared CI runners where the planner collects a file (--collect-only counts its tests) but the per-file subprocess fails to stat it moments later. A single immediate retry could land in the same brief high-load window and fail again, and the retry was gated on one Path.exists() check that can itself be a flaky stat under that load — so a freshly-added test file that LPT pins to one shard would deterministically red that shard on every run (no actual test failure; the file just never executes). - Extract the subprocess spawn/communicate/process-tree-kill logic into a shared _spawn_pytest_once() helper (removes ~90 lines of duplication between the primary run and the retry). - Replace the single-shot retry with a bounded backoff loop (_EXIT4_RETRY_ATTEMPTS, escalating sleep) that re-runs while the file is present on disk. - Add _file_present() which re-checks existence across a few spaced stats, so a single flaky negative stat doesn't wrongly conclude the file is missing. A genuinely-missing file (typo/deleted) still fails fast — exit 4 is not swallowed when the file truly does not exist. - Tests: transient-then-pass recovery, genuinely-missing fails fast with no retry, give-up after max attempts, and _file_present transient/missing cases.	2026-06-09 21:39:09 -07:00
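The retry policy described above, as a sketch with injected callables so it can be exercised without spawning pytest. The function names, default attempt counts, and sleep values are assumptions rather than the runner's actual `_EXIT4_RETRY_ATTEMPTS` settings.

```python
import time
from pathlib import Path
from typing import Callable


def file_present(path: str, checks: int = 3, gap: float = 0.1,
                 sleep: Callable[[float], None] = time.sleep) -> bool:
    """Re-check existence a few times so one flaky negative stat is not final."""
    for i in range(checks):
        if Path(path).exists():
            return True
        if i < checks - 1:
            sleep(gap)
    return False


def run_with_exit4_retry(run_once: Callable[[], int], is_present: Callable[[], bool],
                         attempts: int = 3, base_sleep: float = 0.5,
                         sleep: Callable[[float], None] = time.sleep) -> int:
    """Re-run while pytest exits 4 and the file is really on disk."""
    code = run_once()
    for attempt in range(1, attempts + 1):
        if code != 4 or not is_present():
            break  # real result, or a genuinely missing file: fail fast
        sleep(base_sleep * attempt)  # escalating backoff
        code = run_once()
    return code
```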
Ben Barclay	5cf6e28a2f	fix(gateway): auto-start after container restart via planned-stop marker (#42675 ) (#43236 ) * fix(gateway): auto-start after container restart via planned-stop marker On Docker (s6-overlay), the gateway runs as a dynamically-registered s6 service. When the container stops/restarts/upgrades, s6 sends the gateway a plain SIGTERM. The shutdown path (_stop_impl) ended with an unconditional _update_runtime_status("stopped"), persisting gateway_state=stopped to the volume. container_boot.py reads that on the next boot and only auto-starts gateways whose last state was "running" (_AUTOSTART_STATES) — so after a routine `docker compose up --force-recreate` the gateway stays down and messaging channels silently go dark, with no error surfaced (issue #42675). The codebase already distinguishes intentional stops from unexpected signals via the planned-stop marker (write_planned_stop_marker / consume_planned_stop_marker_for_self): `hermes gateway stop`, systemd/launchd ExecStop, and Ctrl+C write a marker before signalling, so the handler classifies them as planned. An unmarked SIGTERM (container/s6 restart, OOM, bare kill) is signal-initiated. This wires that existing classification through to the state persist, rather than adding unreliable signal-source inference: - run.py: GatewayRunner._signal_initiated_shutdown, set in shutdown_signal_handler's unmarked-signal branch. In _stop_impl, a signal-initiated (non-restart) teardown now persists "running" instead of "stopped" — preserving the operator's run-intent and overwriting the mid-shutdown "draining" marker so _AUTOSTART_STATES matches on reboot. Operator stops and restarts persist "stopped" as before. - service_manager.py: S6ServiceManager.stop() now writes the planned-stop marker for the supervised PID (read from s6-svstat) before `s6-svc -d`, so an in-container `hermes gateway stop` is correctly classified as intentional (parity with the systemd/launchd/host stop paths, which already mark). 
Best-effort: a marker-write failure falls back to the safe signal-initiated path. Tests: shutdown persist-decision table (signal→running, operator→stopped, restart→stopped), s6 stop marker write + svstat PID parse + failure tolerance. The signal→running and s6-marker tests fail without the respective source change. Verified end-to-end against a container built from this branch: an unmarked SIGTERM to the live gateway leaves gateway_state=running (shutdown-context log confirms signal path); existing real container-restart suite still green. * docs(docker): clarify gateway autostart distinguishes operator-stop from container-kill The per-profile-supervision section described the autostart-across-restart contract as "running gateways come back, stopped stay stopped" without spelling out what records 'stopped'. That contract was the source of #42675 confusion: users expected a restart to bring the gateway back and it didn't. With the write-side fix, only an explicit `hermes gateway stop` records 'stopped'; container/s6 restart SIGTERMs (incl. image upgrades and unexpected exits) leave the state 'running' so the gateway auto-starts. Make that distinction explicit in both the multi-profile and per-profile-supervision sections. * test(docker): real-restart autostart E2E for #42675 Adds test_live_gateway_autostarts_after_real_restart_without_manual_state_stamp: a live s6-supervised gateway is killed by an actual `docker restart` SIGTERM (no manual gateway_state stamp, no planned-stop marker) and must auto-start on the next boot. Exercises the WRITE side of the fix that the existing stamp-based tests bypass. Verified to FAIL against an origin/main image (reconciler logs prior_state=stopped action=registered — the #42675 bug) and PASS against the fixed image (prior_state=running action=started).	2026-06-10 14:01:34 +10:00
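The persist decision table from the tests (signal→running, operator→stopped, restart→stopped) reduces to a small pure function. `state_to_persist` and its parameter names are hypothetical; in the real code the decision lives inside `_stop_impl`.

```python
def state_to_persist(signal_initiated: bool, restarting: bool) -> str:
    """Pick the gateway_state to write to the volume at shutdown.

    signal_initiated: the teardown came from an unmarked signal
    (container/s6 restart, OOM, bare kill), i.e. no planned-stop marker.
    """
    if signal_initiated and not restarting:
        # Keep the operator's run-intent so the next boot auto-starts.
        return "running"
    # Operator stops and restarts persist "stopped" as before.
    return "stopped"
```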
Siddharth Balyan	b4170f3ac2	fix(cron): don't strict-scan script-injected output in no-skills jobs (#43223 ) The runtime assembled-prompt scan (#3968 lineage) selected its pattern tier on has_skills alone. A script-driven, no-skills job injects its script's stdout into the prompt, and that blob was scanned with the STRICT user-prompt pattern set — so any command-shape string in the data feed (e.g. a triage bot ingesting a bug report that quotes `rm -rf /`) hard-blocked the job on every tick. Script output and context_from output are runtime DATA produced by operator-authored code — the same trust class as install-vetted skill markdown, not a user-authored directive prompt. Select the scan tier by what the assembled prompt CONTAINS: when it includes skill content OR injected data, use the looser _scan_cron_skill_assembled set (keeps unambiguous injection directives, drops command-shape patterns, sanitizes invisible unicode instead of blocking). Defense-in-depth is preserved: - The raw user prompt is still strict-scanned at create/update (api_server paths untouched) AND re-scanned strict at runtime even when the looser tier was selected for the data blob. - Plain no-script/no-skills jobs keep the strict scan on the whole assembled prompt. - Injection directives arriving via script stdout still block. Rejected alternative: removing destructive_root_rm from the strict set or a per-job skip_injection_scan flag — both weaken the guard globally.	2026-06-10 08:27:24 +05:30
Ben Barclay	7df3aa34b1	fix(dashboard-auth): warn when public_url override is silently rejected (#43214 ) A non-empty HERMES_DASHBOARD_PUBLIC_URL / dashboard.public_url value that fails URL validation (overwhelmingly: a missing http(s):// scheme, e.g. "hermes.domain.com") was silently discarded by resolve_public_url(), falling back to reconstructing the OAuth redirect_uri from request headers. Behind a reverse proxy that doesn't forward X-Forwarded-Proto reliably, that yields an http:// callback even though the operator explicitly set the public URL — with no signal as to why (#42780). Emit a deduplicated operator-facing WARNING (once per distinct value, since resolve_public_url runs per request) naming the offending value and the required scheme. Turns a silent footgun into a self-diagnosing one; behaviour is otherwise unchanged. Tests assert the warning fires for a scheme-less value, is deduplicated across repeated calls, and stays silent for a valid value — all three fail without the fix.	2026-06-10 12:14:57 +10:00
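A deduplicated warning of this kind might be sketched as follows. The validation rule, the logger name, and the message text are assumptions; the real `resolve_public_url()` takes its input from config and environment and may validate more strictly.

```python
import logging
from typing import Optional, Set
from urllib.parse import urlparse

logger = logging.getLogger("dashboard_auth_sketch")  # hypothetical logger name
_warned_values: Set[str] = set()


def resolve_public_url(value: Optional[str]) -> Optional[str]:
    """Return the override if it is a valid http(s) URL, else None (warn once)."""
    value = (value or "").strip()
    if not value:
        return None
    parsed = urlparse(value)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return value
    # Called per request, so warn only once per distinct bad value.
    if value not in _warned_values:
        _warned_values.add(value)
        logger.warning(
            "Ignoring public URL override %r: it must start with http:// or https://",
            value,
        )
    return None
```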
BROCCOLO1D	29036155ce	fix(terminal): lazy-parse docker env config (#42733) Co-authored-by: BROCCOLO1D <279959838+BROCCOLO1D@users.noreply.github.com>	2026-06-10 11:04:27 +10:00
xxxigm	93340fa3c1	fix(tui_gateway): honor target profile's terminal.cwd on desktop profile switch (#40892 ) * fix(tui_gateway): honor target profile's terminal.cwd on desktop profile switch The desktop's app-global remote mode serves every profile from one tui_gateway backend, so the process-global TERMINAL_CWD only reflects the launch profile. After switching profiles, a new session resolved its workspace from that stale env var and inherited the previous profile's directory. Add _profile_configured_cwd() to read a non-launch profile's own terminal.cwd from its config.yaml (skipping placeholder/empty/missing and non-existent paths so callers fall back cleanly), and wire it into _completion_cwd() with precedence: explicit client cwd -> existing session cwd -> bound profile's configured cwd -> TERMINAL_CWD -> os.getcwd(). Fixes #40334 * test(tui_gateway): cover per-profile cwd resolution (#40334) Pin the new contract: _profile_configured_cwd reads a profile's own terminal.cwd and rejects placeholders/missing paths, and _completion_cwd prefers a bound profile's cwd over a stale launch-profile TERMINAL_CWD while still letting an explicit client cwd win.	2026-06-09 19:45:29 -05:00
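The stated precedence (explicit client cwd -> existing session cwd -> bound profile's configured cwd -> TERMINAL_CWD -> os.getcwd()) can be written as a short fallthrough. `completion_cwd` and its parameters are a hypothetical simplification of `_completion_cwd()`, with the profile lookup passed in as a plain value.

```python
import os
from typing import Mapping, Optional


def completion_cwd(
    client_cwd: Optional[str] = None,
    session_cwd: Optional[str] = None,
    profile_cwd: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Resolve the workspace directory using the precedence from the commit."""
    env = os.environ if environ is None else environ
    # The first non-empty candidate wins; empty strings count as unset.
    for candidate in (client_cwd, session_cwd, profile_cwd, env.get("TERMINAL_CWD")):
        if candidate:
            return candidate
    return os.getcwd()
```

With this order, a profile's own `terminal.cwd` beats a stale launch-profile `TERMINAL_CWD`, while an explicit client cwd still wins over both.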
brooklyn!	aecdacb11b	Merge pull request #43109 from NousResearch/fix/desktop-remote-attach-drops fix(desktop): stage dropped files into the remote session workspace	2026-06-09 19:22:11 -05:00
Brooklyn Nicholson	7ffc216bc0	fix(agent): make a binary @file: reference actionable instead of a dead end A binary @file: ref (PDF, docx, spreadsheet, …) expanded to a bare "binary files are not supported" warning with no content. The model saw a failure and gave up — e.g. a dropped PDF came back as a text note claiming the type was unsupported, even though the file was staged on disk right next to it. Inject an actionable content block instead: the path, mime type, size, and a nudge to use its tools to read/convert/view the file (and explicitly not to tell the user the type is unsupported). General across every binary type — not PDF-specific. The file already resolves where the agent's tools run (local cwd or the staged copy in a remote session workspace), so it can act on it directly.	2026-06-09 19:16:46 -05:00
brooklyn!	218452b050	fix(state.db): recover from malformed sqlite_master so hidden sessions reappear (#43149 ) * fix(state.db): recover from malformed sqlite_master so hidden sessions reappear The corruption class behind "Desktop/Dashboard show no sessions while hundreds of session files sit on disk" is a malformed sqlite_master — most often a duplicate object row, e.g. two CREATE VIRTUAL TABLE messages_fts entries — surfacing as: sqlite3.DatabaseError: malformed database schema (messages_fts) - table messages_fts already exists SQLite parses the whole schema while preparing the FIRST statement on a connection, so on this class every statement fails before it runs: PRAGMA journal_mode (which is where SessionDB.__init__ actually trips, in apply_wal_with_fallback, BEFORE _init_schema), PRAGMA integrity_check, and even DROP TABLE. The only operations that still work are PRAGMA writable_schema=ON plus direct sqlite_master surgery. A plain FTS-index rebuild at the _init_schema layer therefore cannot reach or fix this; the canonical sessions/messages rows are intact — only the derived schema is broken. Add a dedicated recovery that operates where the failure actually happens: - hermes_state.repair_state_db_schema(): backs up the raw file first, then a least-destructive ladder — (1) de-duplicate sqlite_master keeping the lowest rowid per object (preserves the existing FTS index), escalating to (2) drop every messages_fts* schema object + VACUUM and let the next open rebuild the FTS index from messages. sessions/messages are never modified. Plus is_malformed_db_error() to discriminate this class. - SessionDB.__init__ auto-heals: on a malformed-schema open error it repairs once (process-guarded against loops / concurrent web_server opens) and reopens, so Desktop/Dashboard recover on their own instead of silently showing "no sessions". - hermes doctor --fix detects the malformed class and repairs it (reporting the recovered session count + backup name). 
- hermes sessions repair [--check-only] [--no-backup] runs on the raw file path, since SessionDB() itself cannot open a malformed DB. Supersedes #32589 and #33869: both targeted FTS corruption but gated their repair behind statements (integrity_check / SELECT / DROP TABLE) that themselves fail on this class, and neither addressed the apply_wal_with_fallback open-time failure. Credit preserved via Co-authored-by. Closes #33865. Co-authored-by: João Vitor Cunha <145560011+plcunha@users.noreply.github.com> Co-authored-by: Tuna Dev <273476039+tuancookiez-hub@users.noreply.github.com> * test(state.db): cover strat-B escalation + unrepairable safe-fail paths --------- Co-authored-by: João Vitor Cunha <145560011+plcunha@users.noreply.github.com> Co-authored-by: Tuna Dev <273476039+tuancookiez-hub@users.noreply.github.com>	2026-06-09 18:49:08 -05:00
Teknium	57c6714995	fix(models): keep curated Anthropic aliases in /model picker (#43103 ) The Anthropic picker returned the live /v1/models dump verbatim whenever credentials were configured. Anthropic's API lags newly-routed curated aliases (e.g. claude-fable-5, reachable on Anthropic before the models endpoint enumerates it), so the curated entry vanished from the picker. Merge curated _PROVIDER_MODELS["anthropic"] with the live catalog — curated first, live-only appended, deduped — mirroring the OpenAI curated-merge path. Live failure / no creds falls back to curated verbatim.	2026-06-09 14:45:19 -07:00
brooklyn!	8d71c38919	fix(desktop): rebind sessions after websocket reconnect (salvage of #41740) (#43004) * fix(desktop): rebind sessions after websocket reconnect * docs(desktop): explain the reconnect-resume guard in use-route-resume The reconnect fix turns on two subtle conditions with no inline rationale: `seenGatewayStateRef` suppresses a spurious "became open" on the first effect run (so a session mounting with the gateway already open doesn't double-resume), and the `gatewayBecameOpen ||` arm forces a re-resume even when the route looks `alreadyActive` because the cached runtime id can be stale after the gateway rebinds/reaps the session. Comment both so the next reader doesn't "simplify" them back into the original bug. No behavior change. --------- Co-authored-by: Josh Dow <josh.dow@prepad.io>	2026-06-09 19:01:00 +00:00
Siddharth Balyan	46fedef07f	fix(openrouter): never send reasoning field for adaptive Anthropic models (#43012 ) The previous fix (#42991) only omitted reasoning when it was being disabled. But reasoning-mandatory Anthropic models (Claude 4.6+, fable) 400 with thinking.type.disabled on EVERY tool-continuation turn even when reasoning is enabled: chat_completions never replays signed thinking blocks, so the prior assistant tool_call has no thinking, and OpenRouter resolves "reasoning requested but history has none" by emitting thinking.type.disabled — which these models reject. Result: first turn works, every turn after the first tool call dies (HTTP 400, non-retryable). OpenRouter ignores reasoning.effort for adaptive Anthropic models anyway (the model self-decides), so the reasoning field is pointless for them on every turn and harmful on tool-replay turns. Omit it entirely → adaptive default. - openrouter profile: drop the reasoning field for reasoning-mandatory Anthropic models regardless of enabled/disabled; legacy Anthropic + non-Anthropic models unchanged. - tests: assert omission across enabled/disabled/effort variants; parity tests switched to a non-Anthropic reasoning model (deepseek) since Anthropic 4.6+ no longer carries a reasoning field. Verified live end-to-end: a tool-replay turn on anthropic/claude-fable-5 with reasoning enabled now builds extra_body=None and returns HTTP 200 (was 400).	2026-06-10 00:18:23 +05:30
brooklyn!	ba44de06da	fix(install): self-heal a stuck Electron download (salvage of #42894 ) (#42998 ) * fix(install): self-heal a stuck Electron download on the desktop build The desktop build downloads Electron (~114MB) from GitHub. A corrupt cached zip, or a blocked/throttled GitHub release host (the repeating "retrying" log), hard-failed the install — and install.sh had no recovery at all while install.ps1 / `hermes desktop` only purged the cache. All three build paths now escalate on a failed `npm run pack`: GitHub → purge corrupt electron-.zip + stale -unpacked and retry → one retry via a public Electron mirror (npmmirror.com). @electron/get SHASUM-verifies the download, and a user-pinned ELECTRON_MIRROR is always respected (never overridden). Adds a bash clear_electron_build_cache()/_desktop_pack() to mirror the existing PowerShell/Python helpers. * test(install): cover the Electron mirror fallback Verify `hermes desktop` falls back to a mirror when the cache purge finds nothing, and that a user-pinned ELECTRON_MIRROR is respected (no extra attempt, not overridden). * docs(desktop): troubleshoot a stuck Electron download Document the automatic cache-purge + mirror fallback, how to pin your own ELECTRON_MIRROR, and how to clear a corrupt cached zip by hand. * docs(install): correct the Electron mirror trust framing The mirror-fallback comments and the desktop troubleshooting doc implied `@electron/get`'s SHASUM check makes the npmmirror.com download safe against tampering. It doesn't: the SHASUMS256.txt is fetched from the same mirror, so the check guards against a corrupt/partial download, not a compromised mirror. Reframe all four surfaces (install.sh, install.ps1, `hermes desktop`, and the docs) to state the trust trade-off honestly — npmmirror.com is the de-facto Electron community mirror, we only fall back to it after the canonical GitHub download fails, and a user-pinned ELECTRON_MIRROR is never overridden. No behavior change. 
--------- Co-authored-by: xxxigm <tuancanhnguyen706@gmail.com>	2026-06-09 18:19:14 +00:00
Siddharth Balyan	1febb08240	fix(anthropic): default new Claude models to the modern thinking contract (#42991 ) New Anthropic models without a recognized version substring (claude-fable-5 and future named/numbered releases) were classified as legacy and routed down the manual-thinking path, which made OpenRouter emit thinking.type.disabled — a form reasoning-mandatory Claude models reject with a non-retryable HTTP 400. Invert the brittle version-substring allowlists to default-to-modern (mirroring _get_anthropic_max_output): unknown Claude models get the adaptive/xhigh/ no-sampling contract, with an explicit legacy list for older families. Non-Claude Anthropic-Messages models (minimax, qwen3, …) keep the manual path. - anthropic_adapter: _supports_adaptive_thinking / _supports_xhigh_effort / _forbids_sampling_params now default unknown Claude models to modern; legacy families enumerated in _LEGACY_MANUAL_THINKING_CLAUDE_SUBSTRINGS. - openrouter profile: omit reasoning entirely (→ adaptive default) instead of forwarding {enabled:false} for reasoning-mandatory Anthropic models; legacy Anthropic + all non-Anthropic models still pass the disable form through. - model_metadata + output-limit table: register claude-fable-5 (1M ctx, 128K out). Tests assert the invariant ("unknown Claude model -> modern contract; legacy stays manual; non-Claude unaffected"), not specific model names.	2026-06-09 23:37:23 +05:30
Frowte3k	39b76d9013	fix(packaging): ship optional-mcps catalog in wheel and sdist (#39859 ) The shipped MCP catalog (optional-mcps/) wasn't packaged, so `hermes mcp catalog` and the dashboard catalog screen come up empty on pip/Homebrew/Nix installs even though the manifests exist in the repo. The runtime expects a packaged catalog (get_optional_mcps_dir() -> _get_packaged_data_dir("optional-mcps"); list_catalog() returns [] when it's absent). Ship it like locales: pyproject [tool.setuptools.data-files] for the wheel + a MANIFEST.in graft for the sdist. optional-mcps/ is nested (optional-mcps/<name>/manifest.yaml) and data-files flattens each glob into its target dir, so each catalog entry gets its own target to preserve the per-entry directory the catalog iterates over.	2026-06-09 14:03:20 -04:00
Teknium	967c325da8	fix(models): read OpenRouter live context_length before hardcoded catch-all (#42986 ) OpenRouter-routed slugs that are absent from models.dev (e.g. a freshly shipped anthropic/claude-fable-5) fell through to the generic DEFAULT_CONTEXT_LENGTHS["claude"]=200K entry and under-reported their real 1M window. The step-6 OpenRouter live-metadata fallback was gated on `not effective_provider`, but an OpenRouter selection sets effective_provider="openrouter" (inferred from the base URL), so that branch was dead code for every OR model. Add a dedicated step-5 OpenRouter branch that consults the live /models catalog (authoritative, refreshes as new slugs ship) before models.dev and the hardcoded family defaults — mirroring the existing Nous/Copilot/GMI branches. Keeps the Kimi-family 32k underreport guard. Per-model values are respected (claude-haiku-4.5 stays 200K), so it does not blanket-bump to 1M. Regression tests cover the fable-5 case, the genuinely-200k case, and the Kimi guard.	2026-06-09 10:49:32 -07:00
Teknium	f6f573ebaa	feat(plugins): install from a subdirectory within a repo (#42963 ) Support installing a plugin that lives in a subdirectory of a larger repo (docs/tests at root, plugin in a subdir) without forcing a dedicated single-plugin repo. Identifier syntax: owner/repo/path/to/plugin (shorthand + subpath) <url>.git/path/to/plugin (.git boundary on GitHub-style URLs) <url>#path/to/plugin (explicit fragment, any scheme) _resolve_git_url now returns (git_url, subdir); _install_plugin_core reads the manifest from and moves only the subdir, so root-level docs and tests no longer leak into ~/.hermes/plugins. _resolve_subdir_within guards against path traversal, missing dirs, and non-directories. Both the CLI (hermes plugins install) and the dashboard install endpoint inherit this for free since they share _install_plugin_core. Dashboard install hint + placeholder updated to advertise the subdir syntax. Co-authored-by: Austin Pickett <pickett.austin@gmail.com>	2026-06-09 13:42:51 -04:00
Gille	c6dc2fcd21	fix(desktop): release profile backends before delete (#42613)	2026-06-09 10:52:02 -05:00
Philip D'Souza	92dfd70d6a	fix(photon): production hardening for the gRPC-native iMessage channel (#42732 ) * fix(photon): override transitive CVEs in the sidecar deps `npm audit` flagged 7 high-severity transitive CVEs (protobufjs code injection GHSA-66ff-xgx4-vchm + outdated @opentelemetry OTLP exporters) pulled in via spectrum-ts -> @photon-ai/otel. npm's suggested fix downgrades spectrum-ts to a version that targets the decommissioned spectrum host, so instead pin patched versions via `overrides` (protobufjs 8.6.1, @opentelemetry/* 0.218.0) without touching spectrum-ts. `npm audit` -> 0; spectrum-ts + provider still import. * fix(photon): harden the sidecar bridge + bound the dedup cache - constant-time sidecar control-token comparison (was `!==`, timing-attackable). - cap the control-channel request body (2 MiB) so a compromised local peer can't OOM the sidecar. - wrap the inbound gRPC stream consumer in a re-subscribe loop with capped exponential backoff + jitter — if the async iterator throws/ends it would otherwise stop inbound forever (the adapter dedupes any replay). - add an unhandledRejection handler so a stray rejection logs instead of killing the process. - dedup cache (adapter) was a true bounded LRU only for expired entries; a burst of unique ids within the window grew it without limit. Evict oldest at the cap. * chore: add AUTHOR_MAP entry for PhilipAD --------- Co-authored-by: PhilipAD <philipadsouza@gmail.com>	2026-06-09 11:12:58 -04:00
Brian D. Evans	b5421f4ba6	fix(deps): declare packaging as a core dependency so it ships everywhere (#40522 ) * fix(deps): declare packaging as a core dependency so it ships everywhere packaging is imported directly on three production paths but was never declared in [project.dependencies], so it only reached users transitively (pip/uv pull it for other tools). The slim official Docker image ships without it, where each try/except-ImportError fallback silently degrades: - plugins/memory/hindsight/__init__.py (_meets_minimum_version) returns False when packaging is absent, disabling update_mode='append' so every session leaks separate Hindsight documents (the reported #40503 symptom). - tools/lazy_deps.py (_is_satisfied) falls back to "installed counts as satisfied", defeating every version-constraint check on lazy extras. - hermes_cli/main.py drops to naive name==version requirement parsing. Promote it to a declared core dep pinned to packaging==26.0 — the exact version already resolved in uv.lock, so there is zero resolution churn (the lock change is two edge annotations marking it transitive->direct). It is a pure-Python py3-none-any wheel with no compiled extensions, safe to ship on every platform. Declaring it also wires it into the _verify_core_dependencies_installed() update-repair guard, which reinstalls missing [project.dependencies] on hermes update. Adds a hermetic tomllib-parse regression test that fails before the declaration and passes after. Fixes #40503 * test(deps): make packaging dep-name extraction PEP 508-robust Address Copilot review on #40522: the inline name-extraction only handled ==, >=, [ and ; and could mis-parse valid requirement strings using <=, ~=, !=, <, > or a direct reference (name @ url). Factor a _distribution_name helper that drops markers, direct-reference URLs and extras, then strips any version operator via regex, so a future dep declared with any PEP 508 specifier shape is matched correctly. 
--------- Co-authored-by: briandevans <252620095+briandevans@users.noreply.github.com>	2026-06-09 11:11:48 -04:00
xxxigm	57775e9e16	test(agent): cover char-based output-cap overflow parsing (#42741) Add TestParseCharBasedOutputCap for the LM Studio / llama.cpp phrasing (context in tokens, prompt in characters): the reported error resolves to the available output budget, the retried cap plus the estimated input stays inside the window, and a prompt larger than the window falls through to None so the prompt-too-long/compression path still owns that case.	2026-06-09 03:17:12 -07:00
teknium1	24a934295f	test(yuanbao): add missing patch import to pipeline tests The salvaged refactor's new tests use unittest.mock.patch (25 call sites) but the import line only brought in AsyncMock and MagicMock, so 10 of the new tests failed with NameError. Add patch to the import.	2026-06-09 03:17:00 -07:00
loongzhao	ffcd9d7ac7	refactor(yuanbao): consolidate media resolution into dedicated pipeline middlewares	2026-06-09 03:17:00 -07:00
JP Lew	cb4cc08b0a	fix(codex): record app-server token usage in session accounting	2026-06-09 02:46:04 -07:00
kshitij	85852b71d8	fix(nemo-relay): preserve downstream errors in adaptive execution (#42691 ) Based on #42658 by @mnajafian-nv. Preserves the real downstream provider/tool exception when NeMo Relay's managed adaptive execution wraps a failing callback as an internal runtime error. Without this, the original exception (and its retry-classification signal, e.g. status_code) is lost behind Relay's wrapper. Salvage changes on top of the original PR: - Tolerant Relay-wrapper match: _is_relay_wrapped_callback_error now uses str.startswith on the "internal error: <cls>: <msg>" prefix instead of exact equality, so a future Relay version appending a traceback/suffix doesn't silently defeat the unwrap. On a total format change it returns False and falls back to the pre-fix behavior (surfacing Relay's error) rather than masking it. - Deduplicated the LLM and tool execute paths into a shared _run_managed_with_downstream_preservation helper, removing ~20 lines of copy-pasted nonlocal/try-except scaffolding that could drift out of sync. - Added a real-middleware regression guard (test_nemo_relay_downstream_unwrap_matches_real_middleware_wrapper_shape) that drives hermes_cli.middleware._run_execution_chain and asserts the plugin's _original_downstream_error unwraps the actual private _DownstreamExecutionError wrapper. The original synthetic tests modeled the wrapper with a local class, so a rename or shape change in core middleware would not have been caught; this test fails loudly if that contract drifts. Co-authored-by: mnajafian-nv <mnajafian@nvidia.com>	2026-06-09 02:31:10 -07:00
Teknium	8d99b5bc4f	fix(gateway): cap terminal code-block preview in non-verbose mode (#42729) The markdown code-block change rendered args['command'] in full in both verbose AND non-verbose (all/new) modes, so a long or multi-line terminal command bypassed the tool_preview_length cap (default 40) and rendered as a huge block. Non-verbose now collapses to a single line capped at the preview length while keeping the fence; verbose keeps the full command.	2026-06-09 02:28:47 -07:00
kshitij	a38cc69bcc	fix(terminal): complete sane PATH entries on POSIX (salvage of #35614 ) (#42653 ) * fix(terminal): complete sane PATH entries on POSIX Fixes macOS gateway/launchd terminal sessions whose PATH already includes /usr/bin while omitting Apple Silicon Homebrew paths. LocalEnvironment._make_run_env() now appends each missing _SANE_PATH entry individually on POSIX, preserving caller precedence and avoiding duplicate sane entries. Root cause: the previous logic used /usr/bin as the sentinel for sane PATH injection. macOS launchd commonly provides /usr/bin while leaving out /opt/homebrew/bin and /opt/homebrew/sbin, so Homebrew-installed CLIs stayed unavailable in terminal tool calls. Salvaged from #35614 by @y0shua1ee. Fixes #35613. Co-authored-by: y0shua1ee <104712437+y0shua1ee@users.noreply.github.com> * test(terminal): harden sane PATH completion against dup/empty entries Follow-up to the #35613 fix. Strengthens _append_missing_sane_path_entries: - De-duplicate the caller-supplied PATH (first occurrence wins) so a PATH that already contains duplicate entries is collapsed rather than carried through. Previously only newly-appended sane entries were guarded against duplication; pre-existing caller duplicates were preserved verbatim. - Drop empty PATH entries (leading/trailing/double ':'), which POSIX shells interpret as the current working directory — a mild foot-gun in a default terminal environment. Behaviour for well-formed PATHs (no duplicates, no empty entries) is byte-identical to before; only malformed/duplicated inputs change. Adds regression tests for: the literal macOS launchd PATH (/usr/bin:/bin:/usr/sbin:/sbin), caller-duplicate collapsing with order preservation, and empty-entry stripping. 
* docs(terminal): clarify PATH normalisation semantics; drop dead set add Addresses review findings on the sane-PATH completion follow-up: - Sharpen the _append_missing_sane_path_entries docstring to state explicitly that on POSIX the caller PATH is rewritten (empty entries stripped, duplicates collapsed) rather than merely appended to, and that well-formed PATHs remain byte-identical bar the appended sane entries. This makes the intentional semantic change visible rather than buried under "hardening". - Document why _path_env_key is a deliberate second Windows guard distinct from the helper's early return (key-casing selection vs standalone safety), so neither is mistaken for redundant and removed. - Drop the dead `seen.add(entry)` in the sane-entry loop: _SANE_PATH is a static duplicate-free constant, so the membership check against the caller entries is sufficient and `seen` is never read afterwards. No behaviour change: verified byte-identical output across the launchd, minimal, empty, duplicate, empty-entry and already-full cases, and re-confirmed gh/brew resolve through the real LocalEnvironment.execute() path under a launchd-style PATH. 133 targeted tests pass. Intentionally NOT consolidating with tools/browser_tool._merge_browser_path: it prepends (vs append), filters on os.path.isdir, uses os.pathsep, and draws from a dynamic candidate set — a shared helper is a separate refactor, out of scope for this bugfix. --------- Co-authored-by: y0shua1ee <104712437+y0shua1ee@users.noreply.github.com>	2026-06-09 02:21:12 -07:00
kshitij	76f89d66de	fix(test): track TERMINAL_CONFIG_ENV_MAP after env-sync consolidation (#42695 ) `test_terminal_config_env_sync.py::_save_config_env_sync_keys()` AST-scanned `hermes_cli/config.py:set_config_value` for a `_config_to_env_sync = {...}` literal. The terminal-config env bridging was consolidated onto the canonical `TERMINAL_CONFIG_ENV_MAP` (now read via `terminal_config_env_var_for_key()`), so that literal no longer exists and the scanner raised: AssertionError: Could not find `_config_to_env_sync = {...}` literal in source failing 8 of 9 tests on main for every PR. Read the live `TERMINAL_CONFIG_ENV_MAP` instead — the actual source of truth `set_config_value` bridges through — mirroring its `terminal.cwd` exclusion. Refresh the stale module docstring and the now-incorrect error-message hints that still referenced `_config_to_env_sync`. Verified: the suite goes green, and a mutation (dropping `docker_volumes` from `TERMINAL_CONFIG_ENV_MAP`) still trips the pinned regression test, so the drift guard retains its teeth.	2026-06-09 02:11:46 -07:00
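
Note on a38cc69bcc (sane PATH completion): the behaviour that commit describes can be sketched roughly as below. This is a minimal illustration, not the repository's code. The function name, the contents of SANE_PATH, and the hard-coded ":" separator are assumptions made for the example; the log only states that empty entries are dropped, caller duplicates collapse with the first occurrence winning, and each missing sane entry is appended individually after the caller's entries.

```python
# Hypothetical stand-in for the real _SANE_PATH constant; the actual
# entries and their order are not shown in the commit log.
SANE_PATH = (
    "/opt/homebrew/bin",
    "/opt/homebrew/sbin",
    "/usr/local/bin",
    "/usr/bin",
    "/bin",
    "/usr/sbin",
    "/sbin",
)


def append_missing_sane_path_entries(path: str, sane=SANE_PATH) -> str:
    """Normalise a POSIX PATH string and append any missing sane entries."""
    seen = set()
    entries = []
    for entry in path.split(":"):
        # Empty entries mean "current directory" to POSIX shells: drop them.
        # Repeated entries collapse; the first occurrence keeps its position.
        if not entry or entry in seen:
            continue
        seen.add(entry)
        entries.append(entry)
    # Caller entries keep precedence; sane entries are only appended.
    # `sane` is assumed to be duplicate-free, so `seen` needs no update here.
    for entry in sane:
        if entry not in seen:
            entries.append(entry)
    return ":".join(entries)
```

Under these assumptions, the launchd-style PATH quoted in the commit (/usr/bin:/bin:/usr/sbin:/sbin) keeps its four entries in order and gains the Homebrew and /usr/local/bin entries at the end.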

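Note on 57c6714995 (curated Anthropic aliases): the merge order that commit describes (curated first, live-only appended, deduped, curated verbatim when the live catalog is unavailable) could look something like the sketch below. The function name and the example model lists are hypothetical; only the ordering rules come from the commit message.

```python
def merge_curated_with_live(curated, live):
    """Return curated ids first, then live-only ids, without duplicates.

    `live` is None or empty when the live catalog could not be fetched
    (failure or no credentials); the curated list is then returned as-is.
    """
    if not live:
        return list(curated)
    merged = []
    seen = set()
    for model_id in list(curated) + list(live):
        if model_id not in seen:
            seen.add(model_id)
            merged.append(model_id)
    return merged


# Illustrative data only: "claude-live-only" is a made-up id.
curated = ["claude-fable-5", "claude-haiku-4.5"]
live = ["claude-haiku-4.5", "claude-live-only"]
picker = merge_curated_with_live(curated, live)
```

With this ordering, a curated alias that the live endpoint has not yet enumerated stays at the top of the picker instead of disappearing.
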

5226 commits