feat(patch): indentation preservation, CRLF preservation, per-file failure escalation (#507) (#32273)

Three granular patch-tool refinements from the Roo Code deep-dive (#507).

## Indentation preservation (fuzzy_match.py)

When fuzzy_find_and_replace matches via a non-exact strategy, the file's
indentation may differ from what the LLM sent in old_string/new_string
(common case: model sends zero-indent old/new for a method body that
lives inside an 8-space-indented class). Before this commit the
replacement was spliced in verbatim, producing a file with a broken
indent level that may still parse but is logically wrong.

The fix computes the indent delta between old_string's first meaningful
line and the matched region's first meaningful line, then re-indents
every non-blank line of new_string by that delta (blank lines are left
as-is). Exact-strategy matches are untouched (passthrough). Same
approach as Roo Code's multi-search-replace.ts:466-500.
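
As an illustration only, here is a minimal standalone sketch of the
base-indent swap described above. It is a simplified restatement, not
the code in this commit: `leading_ws` and `reindent` are hypothetical
names, and the sample strings are made up.

```python
def leading_ws(line: str) -> str:
    """Return the run of spaces/tabs at the start of a line."""
    return line[: len(line) - len(line.lstrip(" \t"))]


def reindent(matched_region: str, old_string: str, new_string: str) -> str:
    """Swap old_string's base indent for matched_region's base indent."""
    def first_meaningful(text: str):
        # First line with any non-whitespace content, or None.
        return next((ln for ln in text.split("\n") if ln.strip()), None)

    old_first = first_meaningful(old_string)
    file_first = first_meaningful(matched_region)
    if old_first is None or file_first is None:
        return new_string
    old_indent = leading_ws(old_first)
    file_indent = leading_ws(file_first)
    if old_indent == file_indent:
        return new_string

    out = []
    for line in new_string.split("\n"):
        if not line.strip():
            out.append(line)  # blank lines pass through untouched
        elif leading_ws(line).startswith(old_indent):
            # Replace the model's base prefix, keep any extra nesting.
            out.append(file_indent + line[len(old_indent):])
        else:
            # Shallower than the model's base: anchor to the file's base.
            out.append(file_indent + line.lstrip(" \t"))
    return "\n".join(out)


# The model sent a zero-indent method; the file has it at 8 spaces.
region = "        def area(self):\n            return self.w * self.h"
old = "def area(self):\n    return self.w * self.h"
new = "def area(self):\n    # width times height\n    return self.w * self.h"
result = reindent(region, old, new)
```

Here `result` carries the file's 8-space base on every line while the
model's relative 4-space nesting inside the method is kept.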

## CRLF preservation (file_operations.py)

Models nearly always send tool args with bare LF endings (JSON-encoded),
but the file on disk may have CRLF (Windows-line-ending configs, .bat,
.cmd, .ini files). Before this commit:

- write_file silently normalized CRLF to LF on every overwrite
- patch produced mixed-ending files: the substituted region had LF,
  the surrounding context kept CRLF

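The fix detects the file's existing line ending (from pre_content if
the file was already read for lint/LSP, otherwise from a tiny
head -c 4096 probe) and normalizes the entire write to it: write_file
converts bare-LF content to CRLF when the file on disk is CRLF, and
patch normalizes the patched content to whichever ending it detects.
New files are written verbatim (no detection possible).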
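
A minimal sketch of the detect-then-normalize idea, assuming a 4096
character sample is enough to decide. The helper names and the sample
config text are illustrative, not the exact code in this commit.

```python
from typing import Optional


def detect_line_ending(sample: str) -> Optional[str]:
    """Return "\r\n", "\n", or None when the sample has no line break."""
    head = sample[:4096]
    if "\r\n" in head:
        return "\r\n"
    if "\n" in head:
        return "\n"
    return None


def normalize_line_endings(text: str, target: str) -> str:
    """Collapse CRLF and lone CR to LF, then expand to CRLF if requested."""
    lf = text.replace("\r\n", "\n").replace("\r", "\n")
    return lf.replace("\n", "\r\n") if target == "\r\n" else lf


# A CRLF file patched with bare-LF replacement text ends up mixed...
on_disk = "[core]\r\nautocrlf = true\r\n"
patched = on_disk.replace("autocrlf = true\r\n", "autocrlf = false\neol = crlf\n")

# ...so the whole result is normalized back to the file's own ending.
ending = detect_line_ending(on_disk)
fixed = normalize_line_endings(patched, ending) if ending else patched
```

Collapsing to LF first keeps the conversion idempotent: running it
twice on the same text with the same target gives the same result.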

## Per-file failure escalation (file_tools.py)

When the agent fails to patch the same file 3+ times in a row, the
existing 'old_string not found' hint isn't strong enough — the model
keeps retrying with variations against a stale view of the file.

The fix tracks consecutive failures per (task_id, resolved_path) and
injects an escalating hint after 3 failures: 'This is failure #N
patching X. Stop retrying. Either re-read fresh, use longer context,
or fall back to write_file.' Counter resets on a successful patch to
the same path.
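
A rough sketch of the counter logic, under the assumption that failures
are keyed by task and resolved path as described above. The function
names, the `ESCALATE_AFTER` constant, and the shortened hint strings are
hypothetical stand-ins for illustration.

```python
import threading
from collections import defaultdict

_lock = threading.Lock()
_failures = defaultdict(dict)  # {task_id: {resolved_path: count}}

ESCALATE_AFTER = 3  # assumed threshold, matching the "3 failures" above


def record_failure(task_id: str, path: str) -> int:
    """Increment and return the consecutive-failure count for this path."""
    with _lock:
        counts = _failures[task_id]
        counts[path] = counts.get(path, 0) + 1
        return counts[path]


def reset_failures(task_id: str, path: str) -> None:
    """Clear the counter after a successful patch to this path."""
    with _lock:
        _failures[task_id].pop(path, None)


def hint_for_failure(task_id: str, path: str) -> str:
    """Return the generic hint, or the escalated one from the 3rd failure on."""
    n = record_failure(task_id, path)
    if n >= ESCALATE_AFTER:
        return f"This is failure #{n} patching {path!r}. Stop retrying."
    return "old_string not found. Use read_file to verify the current content."


hints = [hint_for_failure("t1", "/repo/app.py") for _ in range(3)]
reset_failures("t1", "/repo/app.py")  # a successful patch happened
after_reset = hint_for_failure("t1", "/repo/app.py")
```

The first two failures get the generic hint, the third escalates, and a
success in between starts the cycle over.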

## Validation

- 22 new tests across tests/tools/test_fuzzy_match.py (5),
  test_line_ending_preservation.py (12), test_patch_failure_tracking.py (5)
- All existing tests pass (165/165 in the touched files)
- E2E verified with real _handle_patch / _handle_write_file calls
  against real CRLF files and real failure loops

Closes part of #507. The remaining open items in #507 (2b start_line
hint, behavioral rules) were declined after audit:
- 2b adds schema bloat for a problem the existing 'multiple matches'
  contract already handles
- Behavioral rules conflict with the personality system

Items 1, 2d, 2e, 3, 4 of #507 were already landed in earlier work.
Teknium 2026-05-25 15:18:45 -07:00 committed by GitHub
parent c2aa235328
commit 6bd0be30be
GPG key ID: B5690EEEBB952194
6 changed files with 824 additions and 10 deletions


@@ -74,6 +74,46 @@ def _strip_terminal_fence_leaks(text: str) -> str:
     return "".join(cleaned_lines)


+def _detect_line_ending(sample: str) -> Optional[str]:
+    """Return the dominant line ending in ``sample`` or None if undetermined.
+
+    Scans the first 4096 characters and picks ``\\r\\n`` if any CRLF is
+    present (Windows / DOS), otherwise ``\\n`` (Unix).  Returns ``None``
+    for empty / single-line content where we can't tell.  Used to
+    preserve the file's original line endings across write_file and
+    patch operations.  Without this, the agent's bare-LF tool args
+    silently normalize Windows-line-ending files, and patch produces
+    mixed endings when only a substituted region changes.
+    """
+    if not sample:
+        return None
+    # Look at the first chunk: enough to tell, cheap to scan.
+    head = sample[:4096]
+    if "\r\n" in head:
+        return "\r\n"
+    if "\n" in head:
+        return "\n"
+    return None
+
+
+def _normalize_line_endings(text: str, target: str) -> str:
+    """Convert all line endings in ``text`` to ``target`` (``\\n`` or ``\\r\\n``).
+
+    Idempotent: ``_normalize_line_endings(_normalize_line_endings(x, "\\r\\n"), "\\r\\n") == _normalize_line_endings(x, "\\r\\n")``.
+    Lone ``\\r`` characters are converted as well, so mixed-ending
+    content is homogenized in a single pass.
+    """
+    # First collapse to LF (handle CRLF and lone CR), then expand if target
+    # is CRLF.  Order matters: CRLF must be collapsed before lone CR,
+    # otherwise each CRLF would turn into two LFs.
+    lf_normalized = text.replace("\r\n", "\n").replace("\r", "\n")
+    if target == "\n":
+        return lf_normalized
+    if target == "\r\n":
+        return lf_normalized.replace("\n", "\r\n")
+    return text
+
+
 def _get_safe_write_root() -> Optional[str]:
     """Return the resolved HERMES_WRITE_SAFE_ROOT path, or None if unset.
@@ -697,7 +737,29 @@ class ShellFileOperations(FileOperations):
         """Escape a string for safe use in shell commands."""
         # Use single quotes and escape any single quotes in the string
         return "'" + arg.replace("'", "'\"'\"'") + "'"

+    def _detect_file_line_ending(self, path: str, pre_content: Optional[str] = None) -> Optional[str]:
+        """Detect the dominant line ending of a file on disk.
+
+        If ``pre_content`` is already available (we just read the file
+        for lint/LSP purposes), inspect that (zero extra exec calls).
+        Otherwise issue a tiny ``head -c 4096`` to sample the first 4KB.
+
+        Returns ``"\\r\\n"`` for CRLF (Windows), ``"\\n"`` for LF (Unix),
+        or ``None`` if undetermined (new file, empty file, single-line
+        file with no line break in the first chunk).
+        """
+        if pre_content:
+            return _detect_line_ending(pre_content)
+
+        # File may not exist (new write): `head` then exits non-zero or
+        # prints nothing, and either case yields None below.  Cheap probe.
+        head_cmd = f"head -c 4096 {self._escape_shell_arg(path)} 2>/dev/null"
+        head_result = self._exec(head_cmd)
+        if head_result.exit_code != 0 or not head_result.stdout:
+            return None
+        return _detect_line_ending(head_result.stdout)
+
     def _unified_diff(self, old_content: str, new_content: str, filename: str) -> str:
         """Generate unified diff between old and new content."""
         old_lines = old_content.splitlines(keepends=True)
@@ -975,6 +1037,17 @@ class ShellFileOperations(FileOperations):
             if read_result.exit_code == 0 and read_result.stdout:
                 pre_content = read_result.stdout

+        # ── Line-ending preservation (Roo Code pattern) ──────────────
+        # If the file existed with CRLF endings and the agent's content
+        # has bare LFs, convert to CRLF before writing.  Otherwise the
+        # write silently normalizes a Windows-line-ending file (and patch
+        # produces mixed endings when only a substituted region changes).
+        # Detect from a small head sample to avoid reading the full file
+        # for line-ending purposes alone.
+        original_ending = self._detect_file_line_ending(path, pre_content)
+        if original_ending == "\r\n":
+            content = _normalize_line_endings(content, "\r\n")
+
         # Snapshot LSP diagnostics for this file (best-effort) so the
         # post-write LSP layer can return only diagnostics introduced
         # by this specific edit.  Mirrors claude-code's
@@ -1082,6 +1155,19 @@ class ShellFileOperations(FileOperations):
             except Exception:
                 pass
             return PatchResult(error=err_msg)
+
+        # ── Line-ending preservation ──────────────────────────────────
+        # Models nearly always send old_string/new_string with bare LF
+        # in tool args (JSON-encoded), but the file may have CRLF on
+        # disk.  After fuzzy_find_and_replace, ``new_content`` is a
+        # mixed-ending string: the substituted region is LF, surrounding
+        # text keeps the file's CRLF.  Normalize the whole thing to the
+        # file's detected line ending so the on-disk file is consistent
+        # and the unified diff below reflects the actual change.
+        file_ending = _detect_line_ending(content)
+        if file_ending:
+            new_content = _normalize_line_endings(new_content, file_ending)
+
         # Write back
         write_result = self.write_file(path, new_content)
         if write_result.error:


@@ -254,6 +254,43 @@ _file_ops_cache: dict = {}
 _read_tracker_lock = threading.Lock()
 _read_tracker: dict = {}

+# Track consecutive patch failures per (task_id, resolved_path).  Used to
+# escalate the hint when the model repeatedly fails to patch the same file
+# (typical cause: stale view of file contents, ambiguous old_string, or
+# the file was modified externally between the agent's read and patch
+# attempt).  Reset on a successful patch to that path.
+_patch_failure_lock = threading.Lock()
+_patch_failure_tracker: dict = {}  # {task_id: {resolved_path: count}}
+
+
+def _record_patch_failure(task_id: str, resolved_path: str) -> int:
+    """Increment and return the consecutive-failure count for this path."""
+    with _patch_failure_lock:
+        task_failures = _patch_failure_tracker.setdefault(task_id, {})
+        # Cap dict size per task to avoid unbounded growth in long sessions
+        # where the agent fails on many distinct files.  64 distinct
+        # failing files per task is generous; older entries get evicted.
+        if len(task_failures) >= 64 and resolved_path not in task_failures:
+            try:
+                first_key = next(iter(task_failures))
+                del task_failures[first_key]
+            except StopIteration:
+                pass
+        task_failures[resolved_path] = task_failures.get(resolved_path, 0) + 1
+        return task_failures[resolved_path]
+
+
+def _reset_patch_failures(task_id: str, resolved_paths: list) -> None:
+    """Clear consecutive-failure counts for the given paths."""
+    if not resolved_paths:
+        return
+    with _patch_failure_lock:
+        task_failures = _patch_failure_tracker.get(task_id)
+        if not task_failures:
+            return
+        for rp in resolved_paths:
+            task_failures.pop(rp, None)
+
 # Per-task bounds for the containers inside each _read_tracker[task_id].
 # A CLI session uses one stable task_id for its lifetime; without these
 # caps, a 10k-read session would accumulate ~1.5MB of dict/set state that
@@ -1020,12 +1057,43 @@ def patch_tool(mode: str = "replace", path: str = None, old_string: str = None,
                 _r = _path_to_resolved.get(_p)
                 if _r:
                     file_state.note_write(task_id, _r)
+            # Successful patch: clear any prior consecutive-failure
+            # counters for the touched paths so a future failure on
+            # the same path starts the escalation cycle fresh.
+            _reset_patch_failures(task_id, [
+                _r for _r in (_path_to_resolved.get(_p) for _p in _paths_to_check) if _r
+            ])
+
         # Hint when old_string not found: saves iterations where the agent
         # retries with stale content instead of re-reading the file.
         # Suppressed when patch_replace already attached a rich "Did you mean?"
         # snippet (which is strictly more useful than the generic hint).
         if result_dict.get("error") and "Could not find" in str(result_dict["error"]):
-            if "Did you mean one of these sections?" not in str(result_dict["error"]):
+            # Track per-file consecutive failures for replace mode.  The
+            # ``path`` arg only exists for replace mode; for V4A patches
+            # we'd need to walk the headers, but in practice V4A failures
+            # are far rarer and the existing _hint covers them adequately.
+            failure_count = 0
+            if mode == "replace" and path:
+                resolved = _path_to_resolved.get(path) or path
+                failure_count = _record_patch_failure(task_id, resolved)
+            if failure_count >= 3:
+                # Escalating hint after multiple consecutive failures on the
+                # same path.  Most common cause is a stale view of the file:
+                # the model is retrying with the same old_string against
+                # content that has since changed.  Surface the failure count
+                # so the model recognises it's in a loop and breaks out by
+                # re-reading or falling back to write_file.
+                result_dict["_hint"] = (
+                    f"This is failure #{failure_count} patching {path!r}. "
+                    "Stop retrying with variations of the same old_string. "
+                    "Either: (1) re-read the file fresh to verify current "
+                    "content, (2) use a longer / more unique old_string with "
+                    "surrounding context lines, or (3) use write_file to "
+                    "replace the entire file if the targeted region is hard "
+                    "to anchor."
+                )
+            elif "Did you mean one of these sections?" not in str(result_dict["error"]):
                 result_dict["_hint"] = (
                     "old_string not found. Use read_file to verify the current "
                     "content, or search_files to locate the text."


@@ -108,8 +108,15 @@ def fuzzy_find_and_replace(content: str, old_string: str, new_string: str,
                 if drift_err:
                     return content, 0, None, drift_err

-            # Perform replacement
-            new_content = _apply_replacements(content, matches, new_string)
+            # Perform replacement.  When the matched strategy is NOT `exact`,
+            # the file's indentation may differ from what the LLM sent in
+            # old_string/new_string (e.g. LLM used 2-space indent but the
+            # file is 4-space).  Shift new_string by the indentation delta so
+            # the replacement matches the file's actual indent pattern.
+            new_content = _apply_replacements(
+                content, matches, new_string,
+                old_string=old_string if strategy_name != "exact" else None,
+            )
             return new_content, len(matches), strategy_name, None

     # No strategy found a match
@@ -156,26 +163,119 @@ def _detect_escape_drift(content: str, matches: List[Tuple[int, int]],
     return None


-def _apply_replacements(content: str, matches: List[Tuple[int, int]], new_string: str) -> str:
+def _leading_whitespace(line: str) -> str:
+    """Return the leading whitespace prefix of a line (spaces/tabs)."""
+    i = 0
+    while i < len(line) and line[i] in (" ", "\t"):
+        i += 1
+    return line[:i]
+
+
+def _first_meaningful_line(text: str) -> Optional[str]:
+    """Return the first line of ``text`` that has any non-whitespace content.
+
+    Returns ``None`` if no such line exists (text is empty or all whitespace).
+    """
+    for line in text.split("\n"):
+        if line.strip():
+            return line
+    return None
+
+
+def _reindent_replacement(file_region: str, old_string: str, new_string: str) -> str:
+    """Adjust ``new_string`` so its indentation matches ``file_region``.
+
+    Used after a non-exact fuzzy match: the LLM may have sent old_string and
+    new_string with a different indent than the file actually has (e.g.
+    2-space indent in tool args vs 4-space indent on disk).  The fuzzy
+    strategy successfully matched anyway, but writing ``new_string`` verbatim
+    would corrupt the file's indentation.
+
+    Approach:
+      1. For each non-blank line in ``new_string``, compute its indent
+         *relative* to the first non-blank line of ``old_string`` (the
+         LLM's base indent).
+      2. Anchor that relative indent onto the file's actual base indent (the
+         leading whitespace of the file_region's first non-blank line).
+      3. Re-emit each non-blank line as ``file_base + (line_indent - llm_base)``.
+
+    Blank lines are left untouched.  Lines less-indented than the LLM's
+    base are anchored directly to the file's base indent.
+
+    No-op cases (returns ``new_string`` unchanged):
+      - file_region or old_string has no meaningful line
+      - LLM base indent equals file base indent
+      - new_string is empty
+    """
+    if not new_string:
+        return new_string
+
+    old_first = _first_meaningful_line(old_string)
+    file_first = _first_meaningful_line(file_region)
+    if old_first is None or file_first is None:
+        return new_string
+
+    old_indent = _leading_whitespace(old_first)
+    file_indent = _leading_whitespace(file_first)
+    if old_indent == file_indent:
+        return new_string
+
+    # Re-indent each line of new_string.  Strategy: replace the LLM's base
+    # indent prefix with the file's base indent prefix, preserving any
+    # additional indent the LLM added on top.  This is the same approach
+    # Roo Code uses (multi-search-replace.ts:466-500).  It preserves the
+    # LLM's intended *relative* nesting between lines while anchoring to
+    # the file's actual indent style.
+    out_lines: List[str] = []
+    for line in new_string.split("\n"):
+        if not line.strip():
+            # Blank lines: leave whitespace untouched.
+            out_lines.append(line)
+            continue
+        line_indent = _leading_whitespace(line)
+        if line_indent.startswith(old_indent):
+            # Common case: line has the LLM's base indent (possibly plus
+            # extra).  Swap base prefix for the file's base prefix.
+            remainder = line[len(old_indent):]
+            out_lines.append(file_indent + remainder)
+        else:
+            # Line is less-indented than the LLM's base, e.g. a dedent at
+            # the start of new_string.  Anchor to the file's base.
+            out_lines.append(file_indent + line.lstrip(" \t"))
+    return "\n".join(out_lines)
+
+
+def _apply_replacements(content: str, matches: List[Tuple[int, int]],
+                        new_string: str, old_string: Optional[str] = None) -> str:
     """
     Apply replacements at the given positions.

     Args:
         content: Original content
         matches: List of (start, end) positions to replace
         new_string: Replacement text
+        old_string: When non-None, signals that the match came from a
+            non-exact fuzzy strategy; ``new_string`` is re-indented to
+            match the file's actual indentation before substitution.

     Returns:
         Content with replacements applied
     """
     # Sort matches by position (descending) to replace from end to start
     # This preserves positions of earlier matches
     sorted_matches = sorted(matches, key=lambda x: x[0], reverse=True)

     result = content
     for start, end in sorted_matches:
-        result = result[:start] + new_string + result[end:]
+        if old_string is not None:
+            file_region = content[start:end]
+            adjusted = _reindent_replacement(file_region, old_string, new_string)
+        else:
+            adjusted = new_string
+        result = result[:start] + adjusted + result[end:]

     return result