fix(status): catch OSError in os.kill(pid, 0) for Windows compatibility

On Windows, os.kill(nonexistent_pid, 0) raises OSError with WinError 87
("The parameter is incorrect") instead of ProcessLookupError. Without
catching OSError, the acquire_scoped_lock() and get_running_pid() paths
crash on any invalid PID check — preventing gateway startup on Windows
whenever a stale PID file survives from a prior run.

Adapted @phpoh's fix in #12490 onto current main. The main file was
refactored in the interim (get_running_pid now iterates over
(primary_record, fallback_record) with a per-iteration try/except),
so the OSError catch is added as a new except clause after
PermissionError (which is a subclass of OSError, so order matters:
PermissionError must match first).
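
The ordering constraint can be illustrated with a minimal sketch. The
helper name probe_pid and its injectable kill parameter are hypothetical,
chosen for illustration only, and are not part of this change. The sketch
assumes only that the Windows failure surfaces as a plain OSError, as
described above.

```python
import os
from typing import Callable


def probe_pid(pid: int, kill: Callable[[int, int], None] = os.kill) -> str:
    """Classify a pid as "alive", "denied", or "gone" (hypothetical helper)."""
    try:
        # On POSIX, signal 0 performs an existence/permission check only.
        kill(pid, 0)
    except ProcessLookupError:
        return "gone"      # POSIX: no such process
    except PermissionError:
        return "denied"    # process exists but is owned by another user
    except OSError:
        # Catch-all for other OS-level failures, e.g. the WinError 87 case.
        # This clause must come last: PermissionError and ProcessLookupError
        # are both OSError subclasses and would otherwise be swallowed here.
        return "gone"
    return "alive"
```

Passing kill as a parameter lets each branch be exercised with a stub on
any platform, without signalling a real process.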

Co-authored-by: phpoh <1352808998@qq.com>
phpoh 2026-04-23 03:04:42 -07:00 committed by Teknium
parent 51c1d2de16
commit 4c02e4597e

@@ -496,7 +496,8 @@ def acquire_scoped_lock(scope: str, identity: str, metadata: Optional[dict[str,
 if not stale:
     try:
         os.kill(existing_pid, 0)
-    except (ProcessLookupError, PermissionError):
+    except (ProcessLookupError, PermissionError, OSError):
+        # Windows raises OSError with WinError 87 for invalid pid check
         stale = True
     else:
         current_start = _get_process_start_time(existing_pid)
@@ -743,6 +744,10 @@ def get_running_pid(
     if _record_looks_like_gateway(record):
         return pid
     continue
+except OSError:
+    # Windows raises OSError with WinError 87 for an invalid pid
+    # (process is definitely gone). Treat as "process doesn't exist".
+    continue
 recorded_start = record.get("start_time")
 current_start = _get_process_start_time(pid)