* feat(search_files): path-grouped lossless densification of content matches
Content-mode search_files results repeat the {path,line,content} JSON keys
and the full path string for every match. Group consecutive same-path matches
under one path header with indented '<line>: <content>' rows — lossless (every
path/line/content byte preserved), self-describing (matches_format key), and
readable by the model with no decode step.
57.8% mean token reduction on real search_files content outputs (422-output
corpus), fires on 97% of them. Gated at >=5 matches; below that the verbose
array is left untouched. Default to_dict(densify=False) is unchanged, so no
other caller is affected.
ripgrep emits matches path-ordered, so consecutive grouping never reorders
results.
* test: accept densify kwarg in _FakeSearchResult.to_dict
The search loop-detection tests stub SearchResult with a fake whose
to_dict() must mirror the real signature now that it takes densify=.
* test(search_files): edge-case losslessness battery for densification
Adversarial single-line content (colons, indentation, unicode/emoji, empty,
trailing whitespace, quotes+commas), paths with spaces, and an explicit
one-line-per-match invariant documenting the ripgrep contract the format
relies on (0/6775 real match contents contained a newline).
Remove the [Files already read — do NOT re-read these] user message
that was injected into the conversation after context compression.
This message used role='user' for system-generated content, creating
a fake user turn that confused models about conversation state and
could contribute to task-redo behavior.
The file_tools.py read tracker (warn on 3rd consecutive read, block
on 4th+) already handles re-read prevention inline without injecting
synthetic messages.
Closes#2224.
Co-authored-by: Test <test@test.com>
Follow-up to PR #705 (merged from 0xbyt4). Addresses several issues:
1. CONSECUTIVE-ONLY TRACKING: Redesigned the read/search tracker to only
warn/block on truly consecutive identical calls. Any other tool call
in between (write, patch, terminal, etc.) resets the counter via
notify_other_tool_call(), called from handle_function_call() in
model_tools.py. This prevents false blocks in read→edit→verify flows.
2. THRESHOLD ADJUSTMENT: Warn on 3rd consecutive (was 2nd), block on
4th+ consecutive (was 3rd+). Gives the model more room before
intervening.
3. TUPLE UNPACKING BUG: Fixed get_read_files_summary() which crashed on
search keys (5-tuple) when trying to unpack as 3-tuple. Now uses a
separate read_history set that only tracks file reads.
4. WEB_EXTRACT DOCSTRING: Reverted incorrect removal of 'title' from
web_extract return docs in code_execution_tool.py — the field IS
returned by web_tools.py.
5. TESTS: Rewrote test_read_loop_detection.py (35 tests) to cover
consecutive-only behavior, notify_other_tool_call, interleaved
read/search, and summary-unaffected-by-searches.
_FakeReadResult and _FakeSearchResult now expose the attributes
that read_file_tool/search_tool access after the redact_sensitive_text
integration from main.
- Block file reads after 3+ re-reads of same region (no content returned)
- Track search_files calls and block repeated identical searches
- Filter completed/cancelled todos from post-compression injection
to prevent agent from re-doing finished work
- Add 10 new tests covering all three fixes
When context compression summarizes conversation history, the agent
loses track of which files it already read and re-reads them in a loop.
Users report the agent reading the same files endlessly without writing.
Root cause: context compression is lossy — file contents and read history
are lost in the summary. After compression, the model thinks it hasn't
examined the files yet and reads them again.
Fix (two-part):
1. Track file reads per task in file_tools.py. When the same file region
is read again, include a _warning in the response telling the model
to stop re-reading and use existing information.
2. After context compression, inject a structured message listing all
files already read in the session with explicit "do NOT re-read"
instruction, preserving read history across compression boundaries.
Adds 16 tests covering warning detection, task isolation, summary
accuracy, tracker cleanup, and compression history injection.