From d02a59b67997b9533eb786633604b1bec9c24bf9 Mon Sep 17 00:00:00 2001
From: Siddharth Balyan <52913345+alt-glitch@users.noreply.github.com>
Date: Mon, 8 Jun 2026 12:41:37 +0530
Subject: [PATCH] fix(nix): cold npm builds + fix-lockfiles real-build verification + auto-fix workflow (#41867)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* fix(nix): fix-lockfiles real-build verification + point auto-fix at nix/lib.nix

Two related fixes to the npm lockfile-hash tooling that, together, let a
broken nix build slip onto main and stay there:

1. fix-lockfiles trusted prefetch-npm-deps. It computes the hash from the
   lockfile *contents* and early-exited "ok" whenever that matched the pin,
   never running the real fetchNpmDeps + npmConfigHook build. Those two can
   disagree (the --apply path already works around it), so `--check`
   reported "ok" while a cold build was actually broken (e.g. lockfile
   engines/os/cpu fields the pinned nixpkgs strips from the deps cache,
   tripping npmConfigHook's consistency diff). Now, when prefetch says the
   hash matches, confirm with `nix build .#` before believing it: adopt the
   real fetchNpmDeps hash if nix reports a 'got:' mismatch, surface
   non-hash failures honestly (exit 1) instead of claiming "ok", and keep
   the transient-cache-failure skip.

2. nix-lockfile-fix.yml's auto-fix-main (and the PR-fix job) whitelisted
   and staged nix/tui.nix + nix/web.nix, but the single npmDepsHash moved
   to nix/lib.nix. So fix-lockfiles --apply edited nix/lib.nix, the guard
   flagged it as an "unexpected modified file", and the job exited without
   committing — the auto-healer could never push a fix. Point the guard
   regex and both `git add` lines at nix/lib.nix.

* fix(nix): fix cold npm builds — adopt the deps-cache lockfile in patchPhase

hermes-tui/hermes-agent could not be built from source on the pinned
nixpkgs: prefetch-npm-deps strips advisory lockfile fields
(engines/os/cpu/funding/bin/…) that newer npm writes into
package-lock.json, then npmConfigHook byte-compares the source lockfile
against the cache's stripped copy and fails on the difference. CI only
stayed green because it substitutes the prebuilt hermes-tui from Cachix
and never cold-builds it; anyone building cold (e.g. a local path: input,
or a cache miss) hit the failure.

mkNpmPassthru's patchPhase now copies the cache's own normalized
package-lock.json over the source before npmConfigHook runs, so the
consistency check is trivially satisfied. The resolved dependency set
(version/resolved/integrity/dependencies) is identical — fetchNpmDeps
derived the cache from this very lockfile — so `npm ci` installs the same
tree; only advisory metadata is dropped. Genuine drift is still caught by
the fixed-output npmDepsHash check, which runs before this phase.

Verified by cold-building .#tui and .#default (full hermes-agent) from
scratch on the pinned nixpkgs (6201e2) — both succeed where they
previously failed at npmConfigHook.
---
 .github/workflows/nix-lockfile-fix.yml | 11 ++--
 nix/lib.nix                            | 75 +++++++++++++++++++-------
 2 files changed, 62 insertions(+), 24 deletions(-)

diff --git a/.github/workflows/nix-lockfile-fix.yml b/.github/workflows/nix-lockfile-fix.yml
index ada0b79f23c..b83b0ba3d3f 100644
--- a/.github/workflows/nix-lockfile-fix.yml
+++ b/.github/workflows/nix-lockfile-fix.yml
@@ -75,9 +75,10 @@ jobs:
         run: |
           set -euo pipefail

-          # Ensure only nix files were modified — prevents accidental
-          # self-triggering if fix-lockfiles ever touches package files.
-          unexpected="$(git diff --name-only | grep -Ev '^nix/(tui|web)\.nix$' || true)"
+          # Ensure only nix/lib.nix (home of the single npmDepsHash) was
+          # modified — prevents accidental self-triggering if fix-lockfiles
+          # ever touches package files.
+          unexpected="$(git diff --name-only | grep -Ev '^nix/lib\.nix$' || true)"
           if [ -n "$unexpected" ]; then
             echo "::error::Unexpected modified files: $unexpected"
             exit 1
@@ -89,7 +90,7 @@ jobs:

           git config user.name 'github-actions[bot]'
           git config user.email '41898282+github-actions[bot]@users.noreply.github.com'
-          git add nix/tui.nix nix/web.nix
+          git add nix/lib.nix
           git commit -m "fix(nix): auto-refresh npm lockfile hashes" \
             -m "Source: $GITHUB_SHA" \
             -m "Run: $GITHUB_SERVER_URL/$GITHUB_REPOSITORY/actions/runs/$GITHUB_RUN_ID"
@@ -216,7 +217,7 @@ jobs:
           set -euo pipefail
           git config user.name 'github-actions[bot]'
           git config user.email '41898282+github-actions[bot]@users.noreply.github.com'
-          git add nix/tui.nix nix/web.nix
+          git add nix/lib.nix
           git commit -m "fix(nix): refresh npm lockfile hashes"
           git push

diff --git a/nix/lib.nix b/nix/lib.nix
index ce144537222..7c630ca9b8e 100644
--- a/nix/lib.nix
+++ b/nix/lib.nix
@@ -73,24 +73,25 @@ in

       patchPhase = ''
         runHook prePatch
-        # Normalize trailing newlines on the root lockfile so source and
-        # npm-deps always match, regardless of what fetchNpmDeps preserves.
-        sed -i -z 's/\\n*$/\\n/' package-lock.json

-        # Make npmConfigHook's byte-for-byte diff newline-agnostic by
-        # replacing its hardcoded /nix/store/.../diff with a wrapper that
-        # normalizes trailing newlines on both sides before comparing.
-        mkdir -p "$TMPDIR/bin"
-        cat > "$TMPDIR/bin/diff" << DIFFWRAP
-        #!/bin/sh
-        f1=\\$(mktemp) && sed -z 's/\\n*$/\\n/' "\\$1" > "\\$f1"
-        f2=\\$(mktemp) && sed -z 's/\\n*$/\\n/' "\\$2" > "\\$f2"
-        ${pkgs.diffutils}/bin/diff "\\$f1" "\\$f2" && rc=0 || rc=\\$?
-        rm -f "\\$f1" "\\$f2"
-        exit \\$rc
-        DIFFWRAP
-        chmod +x "$TMPDIR/bin/diff"
-        export PATH="$TMPDIR/bin:$PATH"
+        # prefetch-npm-deps stores a *normalized* package-lock.json in the deps
+        # cache: newer npm writes advisory fields (engines/os/cpu/funding/bin/…)
+        # into lockfile entries, and prefetch strips the ones that don't affect
+        # which tarballs are fetched. npmConfigHook then does a byte-for-byte
+        # diff of the source lockfile against the cache's copy and fails on
+        # those purely-cosmetic differences — this is what breaks cold builds
+        # on a nixpkgs whose prefetch-npm-deps strips fields the committed
+        # lockfile carries.
+        #
+        # Adopt the cache's own normalized lockfile as the source so the
+        # consistency check is trivially satisfied. The resolved dependency set
+        # (version/resolved/integrity/dependencies) is byte-identical either
+        # way — fetchNpmDeps derived the cache *from* this lockfile — so `npm
+        # ci` installs exactly the same tree; only advisory metadata is dropped.
+        # Genuine drift is still caught upstream: a changed lockfile that didn't
+        # get its npmDepsHash refreshed fails the fixed-output hash check before
+        # this phase ever runs.
+        cp --no-preserve=mode,ownership ${npmDeps}/package-lock.json package-lock.json

         runHook postPatch
       '';
@@ -233,9 +234,45 @@ in
         OLD_HASH=$(grep -oE 'npmDepsHash = "sha256-[^"]+"' "$LIB_FILE" | head -1 \
           | sed -E 's/npmDepsHash = "(.*)"/\1/')

+        # prefetch-npm-deps says the hash already matches — but it only hashes the
+        # lockfile *contents* and can disagree with fetchNpmDeps + npmConfigHook,
+        # which validate the full source lockfile against the realized deps cache.
+        # Trusting prefetch alone produced false "ok" results while the actual
+        # build was broken (e.g. lockfile engines/os/cpu fields the pinned nixpkgs
+        # strips from the deps cache, tripping npmConfigHook). So when prefetch
+        # claims the hash is current, confirm with a real consumer build before
+        # believing it.
         if [ "$NEW_HASH" = "$OLD_HASH" ]; then
-          echo "ok"
-          exit 0
+          if VERIFY_OUT=$(nix build ".#${attr}" --no-link --print-build-logs 2>&1); then
+            echo "ok"
+            if [ -n "''${GITHUB_OUTPUT:-}" ]; then
+              { echo "stale=false"; echo "changed=false"; } >> "$GITHUB_OUTPUT"
+            fi
+            exit 0
+          fi
+          # Build failed despite a matching hash. A fixed-output 'got:' means
+          # prefetch genuinely disagreed with fetchNpmDeps — adopt the real hash
+          # and fall through to the stale-handling path below.
+          CORRECT_HASH=$(echo "$VERIFY_OUT" | awk '/got:/ {print $2; exit}')
+          if [ -n "$CORRECT_HASH" ]; then
+            echo "prefetch-npm-deps reported current ($OLD_HASH) but fetchNpmDeps wants $CORRECT_HASH" >&2
+            NEW_HASH="$CORRECT_HASH"
+          elif echo "$VERIFY_OUT" | grep -qE "throttled|HTTP error 418|substituter .* is disabled|some outputs of .* are not valid"; then
+            echo "skipped (transient cache failure — see primary nix build for real status)" >&2
+            echo "$VERIFY_OUT" | tail -8 >&2
+            exit 0
+          else
+            # Not a stale-hash problem — surface it honestly instead of "ok".
+            echo "::error::nix build .#${attr} failed and it is NOT a stale npmDepsHash (no 'got:' hash in output)." >&2
+            echo "The committed lockfile may be incompatible with the pinned nixpkgs" >&2
+            echo "(e.g. engines/os/cpu fields that prefetch-npm-deps strips from the" >&2
+            echo "deps cache, tripping npmConfigHook). fix-lockfiles cannot repair this." >&2
+            echo "$VERIFY_OUT" | tail -40 >&2
+            if [ -n "''${GITHUB_OUTPUT:-}" ]; then
+              { echo "stale=false"; echo "changed=false"; } >> "$GITHUB_OUTPUT"
+            fi
+            exit 1
+          fi
         fi

         HASH_LINE=$(grep -n 'npmDepsHash = "sha256-' "$LIB_FILE" | head -1 | cut -d: -f1)