fix(windows): %1 install error, patch CRLF false-negative, SOUL.md BOM

Three bugs from teknium1's successful install + diagnostic chat on Windows:

1. **Start-Process -FilePath npm.cmd fails with "%1 is not a valid Win32
   application".**  Start-Process bypasses cmd.exe and PATHEXT to call
   CreateProcessW directly, which refuses .cmd batch shims.  Switched
   Install-NodeDeps to use PowerShell's call operator (``& $npmExe
   install --silent *> $log``), which DOES honour PATHEXT.  Extracted a
   ``_Run-NpmInstall`` helper so the browser + TUI paths share the same
   logic.  Captures $LASTEXITCODE correctly, still surfaces the real
   stderr on failure with a log-file pointer for the full output.

2. **patch tool returns false-negative on Windows due to CRLF round-trip.**
   Root cause was upstream of patch: ``subprocess.Popen(..., text=True,
   stdin=PIPE)`` on Windows translates ``\\n`` → ``\\r\\n`` when data flows
   through the stdin pipe.  ``_pipe_stdin()`` was writing the patch's
   new_content string through a text-mode pipe, bash then wrote those
   CRLF bytes to disk, and patch's post-write verify compared the
   on-disk CRLF bytes against the original LF-only string — fail.
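
   The injection can be reproduced on any platform without a Windows box.
   This sketch forces the same ``\n`` → ``\r\n`` translation a Windows
   text-mode pipe applies (``newline="\r\n"`` stands in for Windows's
   ``os.linesep``; the filenames and strings are illustrative only):

   ```python
   import io

   # A TextIOWrapper with newline="\r\n" translates every "\n" written
   # through it, exactly like text=True stdin does on Windows where
   # os.linesep is "\r\n".
   raw = io.BytesIO()
   text_pipe = io.TextIOWrapper(raw, encoding="utf-8", newline="\r\n")
   text_pipe.write("line1\nline2\n")
   text_pipe.flush()

   on_disk = raw.getvalue()
   assert on_disk == b"line1\r\nline2\r\n"          # CRLFs were injected
   assert on_disk != "line1\nline2\n".encode()      # the compare that failed
   ```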

   Fixed in two places for defense in depth:
   - ``_pipe_stdin()`` now writes through ``proc.stdin.buffer`` with
     explicit UTF-8 encoding, bypassing Python's newline translation on
     every platform.  No behaviour change on POSIX (bytes are identical)
     but stops the CRLF injection on Windows.
   - ``patch_replace``'s post-write verify normalizes CRLF→LF on both
     sides before comparing, so even if some future backend still
     translates newlines the patch tool won't report a bogus failure.
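
   The byte-buffer approach can be sketched roughly as follows (simplified:
   no daemon thread, and ``pipe_stdin_bytes`` is a stand-in name, not the
   real helper):

   ```python
   import subprocess
   import sys

   def pipe_stdin_bytes(proc: subprocess.Popen, data: str) -> None:
       """Write through the raw byte buffer, sidestepping newline translation."""
       raw = data.encode("utf-8")
       # text=True gives a TextIOWrapper whose .buffer is the underlying
       # BufferedWriter; a byte-mode Popen has no .buffer, so fall back.
       target = getattr(proc.stdin, "buffer", proc.stdin)
       target.write(raw)
       target.close()

   # Child echoes stdin bytes back verbatim.
   proc = subprocess.Popen(
       [sys.executable, "-c",
        "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"],
       stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)
   pipe_stdin_bytes(proc, "line1\nline2\n")
   out = proc.stdout.read()
   proc.wait()
   assert out == "line1\nline2\n"   # bytes survived the pipe unmodified
   ```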

3. **SOUL.md gets a UTF-8 BOM on Windows PowerShell 5.1.**  ``Set-Content
   -Encoding UTF8`` on PS5.1 writes UTF-8 WITH a byte-order-mark (changed
   in PS7 via ``utf8NoBOM``).  Hermes's prompt-injection scanner sees
   the BOM (U+FEFF invisible char) and refuses to load the file, so
   SOUL.md's persona instructions never get applied.

   Fixed by writing the file via ``[System.IO.File]::WriteAllText``
   with an explicit ``UTF8Encoding($false)`` — BOM-free on every
   PowerShell version.
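
   The BOM's effect is easy to see from Python, where the ``utf-8-sig``
   codec produces the same bytes PS5.1's ``Set-Content -Encoding UTF8``
   does (the ``persona`` string is a stand-in for SOUL.md content):

   ```python
   persona = "Speak as Hermes."            # stand-in for SOUL.md content
   bom_bytes = persona.encode("utf-8-sig")  # UTF-8 with BOM, like PS5.1
   clean_bytes = persona.encode("utf-8")    # BOM-free, like WriteAllText fix

   assert bom_bytes[:3] == b"\xef\xbb\xbf"   # the 3-byte BOM a scanner sees
   assert bom_bytes[3:] == clean_bytes       # payload otherwise identical
   # Plain utf-8 decoding keeps the BOM as an invisible U+FEFF character;
   # utf-8-sig strips it.
   assert bom_bytes.decode("utf-8") == "\ufeff" + persona
   assert bom_bytes.decode("utf-8-sig") == persona
   ```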

All POSIX behaviour verified unchanged: 198 tests pass across
test_file_operations, test_local_env_cwd_recovery, test_code_execution,
test_windows_native_support, test_windows_compat.
Teknium 2026-05-07 18:11:43 -07:00
parent d52e54170a
commit 8f91d7bfa9
3 changed files with 97 additions and 75 deletions


```diff
@@ -99,12 +99,33 @@ def get_sandbox_dir() -> Path:
 def _pipe_stdin(proc: subprocess.Popen, data: str) -> None:
-    """Write *data* to proc.stdin on a daemon thread to avoid pipe-buffer deadlocks."""
+    """Write *data* to proc.stdin on a daemon thread to avoid pipe-buffer deadlocks.
+
+    On Windows, text-mode stdin (``text=True`` / ``encoding="utf-8"``)
+    translates ``\\n`` → ``\\r\\n`` as the data flows through the pipe,
+    which corrupts every write_file / patch call because the bytes that
+    land on disk include injected carriage returns.  The file IS created,
+    but every subsequent byte-count / content compare against the
+    caller's ``\\n``-only string fails.
+
+    Workaround: write through ``proc.stdin.buffer`` (the underlying byte
+    buffer), encoding to UTF-8 ourselves.  That bypasses Python's
+    newline translation entirely on every platform.  No behaviour change
+    on POSIX: the byte sequence is identical to what text-mode would
+    produce there.
+    """
     def _write():
         try:
-            proc.stdin.write(data)
-            proc.stdin.close()
+            # proc.stdin is a TextIOWrapper when text=True was set on the
+            # Popen.  Its ``.buffer`` attribute is the raw BufferedWriter
+            # that bypasses newline translation.  When Popen was created
+            # in byte mode, proc.stdin is already a BufferedWriter with
+            # no ``.buffer`` attribute — fall back to .write() directly.
+            raw = data.encode("utf-8") if isinstance(data, str) else data
+            target = getattr(proc.stdin, "buffer", proc.stdin)
+            target.write(raw)
+            target.close()
         except (BrokenPipeError, OSError):
             pass
```


```diff
@@ -966,11 +966,21 @@ class ShellFileOperations(FileOperations):
         verify_result = self._exec(verify_cmd)
         if verify_result.exit_code != 0:
             return PatchResult(error=f"Post-write verification failed: could not re-read {path}")
-        if verify_result.stdout != new_content:
+        # Normalize line endings before comparing.  On Windows, Python's
+        # default text-mode ``open()`` translates ``\n`` → ``\r\n`` on
+        # write, so the file on disk legitimately holds CRLFs while our
+        # ``new_content`` string has bare LFs.  Without this normalization
+        # every patch on Windows returns a bogus "wrote 39, read 42"
+        # false-negative even though the edit landed correctly.  POSIX
+        # backends don't translate, so this is a no-op there.
+        _verify_stdout_normalized = verify_result.stdout.replace("\r\n", "\n").replace("\r", "\n")
+        _new_content_normalized = new_content.replace("\r\n", "\n").replace("\r", "\n")
+        if _verify_stdout_normalized != _new_content_normalized:
             return PatchResult(error=(
                 f"Post-write verification failed for {path}: on-disk content "
                 f"differs from intended write "
-                f"(wrote {len(new_content)} chars, read back {len(verify_result.stdout)}). "
+                f"(wrote {len(_new_content_normalized)} chars, read back "
+                f"{len(_verify_stdout_normalized)} chars after normalizing line endings). "
                 "The patch did not persist. Re-read the file and try again."
             ))
```