ci: make dependency installs resilient to transient flakes

`npm ci` / `uv sync` / toolchain header fetches occasionally die on
transient network blips — e.g. node-pty's node-gyp fetching Node headers
(an undici assert) during the typecheck job's `npm ci`, which killed the job
before `tsc` ever ran. "Re-run and it goes green" is exactly what CI should
do itself.

- New reusable `.github/actions/retry` composite action wraps a command and
  retries on failure (3x / 10s, command passed via env so it can't inject).
  Applied to every PR-path network install: npm ci (typecheck, desktop
  build, docs site), uv sync (tests, e2e), uv tool install (lint),
  pip install (docs site).
- typecheck now runs `npm ci --ignore-scripts`: `tsc` needs only sources +
  type defs, so skipping install scripts drops node-pty's native rebuild
  (whose header fetch was the flake) and is faster. Validated locally — tsc
  passes for ui-tui, apps/shared, and apps/desktop with scripts skipped.
- ripgrep download uses `curl --retry`.

Docker (main-only) and the release/windows workflows are intentionally left
for a follow-up.
This commit is contained in:
Brooklyn Nicholson 2026-06-19 17:05:34 -05:00 committed by ethernet
parent 2977e74543
commit 56b4ef74a6
5 changed files with 83 additions and 13 deletions

50
.github/actions/retry/action.yml vendored Normal file
View file

@ -0,0 +1,50 @@
name: Retry a flaky command
description: >-
Run a shell command, retrying on non-zero exit. For dependency installs
(npm ci, uv sync) whose only failures are transient network/toolchain
flakes — a node-gyp header fetch, a registry blip — so CI self-heals
instead of needing a manual re-run.
inputs:
command:
description: Shell command to run (and retry).
required: true
attempts:
description: Max attempts before giving up.
default: "3"
delay:
description: Seconds to wait between attempts.
default: "10"
working-directory:
description: Directory to run in.
default: "."
runs:
using: composite
steps:
- shell: bash
working-directory: ${{ inputs.working-directory }}
# command goes through env, never interpolated into the script body, so
# a command with quotes/specials can't break or inject into the runner.
env:
_CMD: ${{ inputs.command }}
_ATTEMPTS: ${{ inputs.attempts }}
_DELAY: ${{ inputs.delay }}
run: |
set -uo pipefail
n=0
while :; do
n=$((n + 1))
echo "::group::attempt $n/$_ATTEMPTS: $_CMD"
if bash -c "$_CMD"; then
echo "::endgroup::"
exit 0
fi
echo "::endgroup::"
if [ "$n" -ge "$_ATTEMPTS" ]; then
echo "::error::failed after $n attempts: $_CMD"
exit 1
fi
echo "::warning::attempt $n failed; retrying in ${_DELAY}s: $_CMD"
sleep "$_DELAY"
done