fix(gateway): drain manual profile gateways via SIGUSR1 before respawn

The PR wired in a detached watcher that respawns manual profile gateways
after they exit.  Pair that with a SIGUSR1 graceful drain (same path
systemd/launchd use) so in-flight agent runs finish instead of getting
SIGTERM'd.  Fall back to SIGTERM if SIGUSR1 isn't wired or the gateway
doesn't exit within the drain budget — the watcher sees the exit and
relaunches either way.

Tested end-to-end against an orphaned gateway: graceful drain exits in
0.5s and the watcher fires the relaunch command.
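
For readers unfamiliar with the pattern, here is a rough sketch of the
drain-then-fallback flow described above. It is not the actual
_graceful_restart_via_sigusr1 helper, whose body is not shown in this
commit. The names drain_then_terminate and pid_alive, the string return
values, and the default timeouts are all assumptions made for
illustration. The real helper appears to return a truthy "drained" flag
and leaves the SIGTERM fallback to the caller.

```python
import os
import signal
import time
from typing import Callable


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID still exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def drain_then_terminate(
    pid: int,
    drain_timeout: float = 30.0,   # hypothetical default, not from the commit
    poll_interval: float = 0.1,    # hypothetical default, not from the commit
    send: Callable[[int, int], None] = os.kill,
    alive: Callable[[int], bool] = pid_alive,
) -> str:
    """Ask pid to drain via SIGUSR1, then fall back to SIGTERM on timeout.

    Returns "drained", "terminated", or "unreachable".
    """
    try:
        send(pid, signal.SIGUSR1)
    except (ProcessLookupError, PermissionError):
        return "unreachable"

    # Poll until the process exits or the drain budget runs out.
    deadline = time.monotonic() + drain_timeout
    while time.monotonic() < deadline:
        if not alive(pid):
            return "drained"
        time.sleep(poll_interval)

    # Drain budget exhausted: fall back to a plain SIGTERM.
    try:
        send(pid, signal.SIGTERM)
    except (ProcessLookupError, PermissionError):
        return "unreachable"
    return "terminated"
```

The send and alive parameters exist only so the sketch can be exercised
without real processes. In either outcome the watcher is expected to see
the exit and relaunch the gateway.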
Teknium 2026-04-30 19:57:42 -07:00
parent 77fe7ab6b2
commit 96691268df
2 changed files with 54 additions and 4 deletions

@@ -7438,13 +7438,23 @@ def _cmd_update_impl(args, gateway_mode: bool):
         if proc.pid in manual_pids
     }
     for pid, proc in profile_processes.items():
-        if launch_detached_profile_gateway_restart(proc.profile, pid):
+        if not launch_detached_profile_gateway_restart(proc.profile, pid):
+            continue
+        # Prefer a graceful SIGUSR1 drain so in-flight agent runs
+        # finish before the watcher respawns the gateway. If the
+        # gateway doesn't support SIGUSR1 or doesn't exit within
+        # the drain budget, fall back to SIGTERM — the watcher
+        # still sees the exit and relaunches either way.
+        drained = _graceful_restart_via_sigusr1(
+            pid, drain_timeout=_drain_budget,
+        )
+        if not drained:
             try:
                 os.kill(pid, _signal.SIGTERM)
-                killed_pids.add(pid)
-                relaunched_profiles.append(proc.profile)
             except (ProcessLookupError, PermissionError):
                 pass
+        killed_pids.add(pid)
+        relaunched_profiles.append(proc.profile)
     for pid in manual_pids:
         if pid in profile_processes: