fix(windows): stop spamming cwd-missing + tirith-spawn warnings on every terminal call

Two log-spam fixes surfaced by a Windows user (Git Bash + Python 3.11.9):

1. LocalEnvironment cwd warn spam
   ==============================
   Git Bash's `pwd -P` emits paths like `/c/Users/x`. The base-class
   `_extract_cwd_from_output` assigned this verbatim to `self.cwd`
   without validation. `_resolve_safe_cwd` then called
   `os.path.isdir("/c/...")`, which returns False on Windows, triggering:

       LocalEnvironment cwd '/c/Users/NVIDIA' is missing on disk;
       falling back to '/' so terminal commands keep working.

   ...on every terminal call. The pre-existing Windows-path translation
   inside `_run_bash` ran AFTER the safe-cwd check, so it could never
   prevent the warning.

   Fix:
   - New `_msys_to_windows_path` helper (idempotent, no-op off Windows).
   - `_resolve_safe_cwd` normalizes before `isdir`, so a valid MSYS path
     is recognized as the real directory it points at.
   - `LocalEnvironment._update_cwd` and a new override of
     `_extract_cwd_from_output` translate + validate before mutating
     `self.cwd`. Stale / non-existent marker paths roll back to the
     previous cwd instead of clobbering it.
   - The fallback warning still fires when the directory really is gone
     (deletion-recovery scenario from #17558 still covered).
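
The translate-then-validate flow described in this list can be sketched as follows. This is a minimal illustration, not the code from this commit: the function names, the regex, and the `is_windows` / `isdir` parameters are assumptions chosen so the example runs on any platform.

```python
import re

# Matches MSYS / Git Bash drive paths such as "/c" or "/c/Users/x".
_MSYS_DRIVE = re.compile(r"^/([A-Za-z])(/.*)?$")


def msys_to_windows_path(path: str, is_windows: bool = True) -> str:
    # Hypothetical translator: "/c/Users/x" becomes a C: drive path.
    if not is_windows:
        return path  # no-op off Windows
    match = _MSYS_DRIVE.match(path)
    if match is None:
        # Already a Windows path (so the call is idempotent) or not a
        # drive-style path at all: leave it untouched.
        return path
    drive = match.group(1).upper()
    rest = match.group(2) or "/"
    return drive + ":" + rest.replace("/", "\\")


def update_cwd(current: str, reported: str, isdir, is_windows: bool = True) -> str:
    # Translate first, then validate; keep the previous cwd when the
    # reported directory does not exist (the rollback behaviour).
    candidate = msys_to_windows_path(reported.strip(), is_windows)
    if candidate and isdir(candidate):
        return candidate
    return current


# Usage with a fake directory check, so no real filesystem is needed.
existing_dirs = {"C:\\Users\\x"}
kept = update_cwd("C:\\old", "/c/gone", existing_dirs.__contains__)
moved = update_cwd("C:\\old", "/c/Users/x\n", existing_dirs.__contains__)
```

Injecting `isdir` keeps the sketch testable off Windows; real code would presumably call `os.path.isdir` directly.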

2. tirith spawn-failed warn spam
   =============================
   When tirith isn't installed (background install in flight, or marked
   failed for the day) and the configured path stays as the bare string
   `tirith`, every `subprocess.run([tirith_path, ...])` raised OSError
   and logged:

       tirith spawn failed: [WinError 2] The system cannot find the file specified

   ...on every command. With fail_open=True the command is still
   allowed, so behaviour is correct, but the log noise is severe.

   Fix:
   - `_warn_once(key, ...)` thread-safe dedupe helper.
   - Three hot-path warnings (`tirith path resolved to None`,
     `tirith spawn failed: ...`, `tirith timed out after Ns`) now log
     once per (exception class, errno) / timeout-value / path-none key.
   - Dedupe set is cleared on `_clear_install_failed` so a successful
     install lets a subsequent failure surface again.
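
The keyed dedupe and its reset can be exercised in isolation. The sketch below is a self-contained approximation, not the module's code: `warn_once`, `reset_warnings`, `spawn_key`, and the list-collecting handler are illustrative names, and the exceptions are constructed by hand rather than raised by a real spawn.

```python
import logging
import threading


class ListHandler(logging.Handler):
    """Collects formatted messages so the example can count log lines."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


logger = logging.getLogger("tirith_dedupe_sketch")
logger.propagate = False
logger.setLevel(logging.WARNING)
captured = ListHandler()
logger.addHandler(captured)

_seen = set()
_seen_lock = threading.Lock()


def warn_once(key, message, *args):
    # Log at most once per key until reset_warnings() is called.
    with _seen_lock:
        if key in _seen:
            return
        _seen.add(key)
    logger.warning(message, *args)


def reset_warnings():
    with _seen_lock:
        _seen.clear()


def spawn_key(exc):
    # One key per (exception class, errno) pair.
    return f"spawn_failed:{type(exc).__name__}:{exc.errno}"


def report_spawn_failure(exc):
    warn_once(spawn_key(exc), "tirith spawn failed: %s", exc)


for _ in range(15):
    report_spawn_failure(FileNotFoundError(2, "No such file or directory"))
report_spawn_failure(PermissionError(13, "Permission denied"))
lines_before_reset = len(captured.messages)

reset_warnings()  # e.g. after a successful install
report_spawn_failure(FileNotFoundError(2, "No such file or directory"))
lines_after_reset = len(captured.messages)
```

Fifteen identical failures plus one failure of a different class yield two log lines; after the reset, the same failure is logged again.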

Tests
=====
- `tests/tools/test_local_env_windows_msys.py`: 12 tests covering the
  MSYS→Windows translator, the resolve fast-path, update_cwd validation,
  and extract_cwd_from_output rollback.
- `tests/tools/test_tirith_security.py`: 4 new dedupe tests (15 spawn
  failures → 1 log line; distinct exc types → 2 lines; timeout dedupe;
  path-None dedupe).

Targeted runs:
  test_local_env_windows_msys.py          12 passed
  test_local_env_cwd_recovery.py           7 passed (pre-existing, no regressions)
  test_tirith_security.py                 67 passed (63 pre-existing + 4 new)
  test_base_environment + local_*         37 passed (no regressions)
  test_local_env_blocklist + neighbours  114 passed

Reported via Hermes log capture: 19× cwd warnings + 15× tirith warnings
in a single short session.
teknium1 2026-05-15 14:58:28 -07:00 committed by Teknium
parent 7fee1f61eb
commit 4aec25bc44
4 changed files with 441 additions and 14 deletions


@@ -101,6 +101,34 @@ _install_failure_reason: str = "" # reason tag when _resolved_path is _INSTALL_
 _install_lock = threading.Lock()
 _install_thread: threading.Thread | None = None
+
+# Warning de-duplication. The spawn/path warnings live in the hot path —
+# without this dedupe set, a Windows install where ``tirith`` isn't on PATH
+# (e.g. background install thread still running, or install marked failed)
+# spams ``tirith spawn failed: [WinError 2]...`` once per terminal command,
+# easily filling errors.log with hundreds of identical lines.
+_warned_messages: set[str] = set()
+_warned_lock = threading.Lock()
+
+
+def _warn_once(key: str, message: str, *args) -> None:
+    """``logger.warning`` but at-most-once per ``key`` for the process
+    lifetime. Used to avoid drowning the log when a fail-open tirith
+    misconfiguration fires on every command."""
+    with _warned_lock:
+        if key in _warned_messages:
+            return
+        _warned_messages.add(key)
+    logger.warning(message, *args)
+
+
+def _reset_spawn_warning_state() -> None:
+    """Clear the warn-once dedupe set. Called when tirith is freshly
+    (re)installed so a subsequent failure surfaces again, e.g. user
+    deletes the binary mid-session.
+    """
+    with _warned_lock:
+        _warned_messages.clear()

 # Disk-persistent failure marker — avoids retry across process restarts
 _MARKER_TTL = 86400 # 24 hours
@@ -168,6 +196,10 @@ def _mark_install_failed(reason: str = ""):
 def _clear_install_failed():
     """Remove the failure marker after successful install."""
+    # Reset the warn-once dedupe set so a subsequent failure (e.g. user
+    # deletes the binary) surfaces in the log again instead of being
+    # silently suppressed by a stale dedupe key from before the fix.
+    _reset_spawn_warning_state()
     try:
         os.unlink(_failure_marker_path())
     except OSError:
@@ -632,7 +664,10 @@ def check_command_security(command: str) -> dict:
     fail_open = cfg["tirith_fail_open"]
     if tirith_path is None:
-        logger.warning("tirith path resolved to None; scanning disabled")
+        _warn_once(
+            "tirith_path_none",
+            "tirith path resolved to None; scanning disabled",
+        )
         if fail_open:
             return {"action": "allow", "findings": [], "summary": "tirith path unavailable"}
         return {"action": "block", "findings": [], "summary": "tirith path unavailable (fail-closed)"}
@@ -646,13 +681,23 @@ def check_command_security(command: str) -> dict:
             timeout=timeout,
         )
     except OSError as exc:
-        # Covers FileNotFoundError, PermissionError, exec format error
-        logger.warning("tirith spawn failed: %s", exc)
+        # Covers FileNotFoundError, PermissionError, exec format error.
+        # Dedupe by ``(errno, exc class)`` so a transient failure mode
+        # surfaces once but doesn't drown the log on every command —
+        # commonly seen on Windows when the configured path "tirith"
+        # isn't on PATH yet (background install still running, or
+        # install marked failed for the day).
+        spawn_key = f"tirith_spawn_failed:{type(exc).__name__}:{getattr(exc, 'errno', '')}"
+        _warn_once(spawn_key, "tirith spawn failed: %s", exc)
         if fail_open:
             return {"action": "allow", "findings": [], "summary": f"tirith unavailable: {exc}"}
         return {"action": "block", "findings": [], "summary": f"tirith spawn failed (fail-closed): {exc}"}
     except subprocess.TimeoutExpired:
-        logger.warning("tirith timed out after %ds", timeout)
+        _warn_once(
+            f"tirith_timeout:{timeout}",
+            "tirith timed out after %ds",
+            timeout,
+        )
         if fail_open:
             return {"action": "allow", "findings": [], "summary": f"tirith timed out ({timeout}s)"}
         return {"action": "block", "findings": [], "summary": "tirith timed out (fail-closed)"}