fix: skip stale cron jobs on gateway restart instead of firing immediately

When the gateway restarted after being down past a scheduled run time,
recurring jobs (cron/interval) fired immediately because their
next_run_at was in the past. Jobs more than 2 minutes late are now
fast-forwarded to the next future occurrence instead of firing.

- get_due_jobs() checks staleness for cron/interval jobs
- Stale jobs get next_run_at recomputed and saved
- Jobs within 2 minutes of their schedule still fire normally
- One-shot (once) jobs are unaffected: they still fire if missed
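
The behavior listed above can be sketched roughly as follows. This is an illustration only, not the code from this commit: `split_due_jobs`, `fast_forward`, `GRACE`, and the job dict fields (`kind`, `interval`, `next_run_at`) are hypothetical names, and recurring jobs are modeled with a fixed interval (a cron expression would need a cron parser to find its next occurrence).

```python
from datetime import datetime, timedelta, timezone

# Assumed grace window, taken from the "2 minutes" in the commit message.
GRACE = timedelta(minutes=2)


def fast_forward(next_run_at: datetime, interval: timedelta, now: datetime) -> datetime:
    """Return the first scheduled occurrence strictly after `now`."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    # Number of whole intervals to skip so the result lands in the future.
    missed = (now - next_run_at) // interval + 1
    return next_run_at + missed * interval


def split_due_jobs(jobs: list[dict], now: datetime) -> list[dict]:
    """Return jobs that should fire now; reschedule stale recurring jobs in place."""
    due = []
    for job in jobs:
        lateness = now - job["next_run_at"]
        if lateness < timedelta(0):
            continue  # not due yet
        if job["kind"] == "interval" and lateness > GRACE:
            # Stale recurring job: skip this run, move to the next occurrence.
            job["next_run_at"] = fast_forward(job["next_run_at"], job["interval"], now)
            continue
        due.append(job)  # on time, slightly late, or a one-shot job
    return due


if __name__ == "__main__":
    now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
    jobs = [
        {"id": "stale", "kind": "interval", "interval": timedelta(hours=1),
         "next_run_at": now - timedelta(minutes=5)},
        {"id": "fresh", "kind": "interval", "interval": timedelta(hours=1),
         "next_run_at": now - timedelta(seconds=30)},
        {"id": "oneshot", "kind": "once", "interval": None,
         "next_run_at": now - timedelta(minutes=10)},
    ]
    print([job["id"] for job in split_due_jobs(jobs, now)])
```

In this sketch, a job that is 30 seconds late still fires, while one that is 5 minutes late is rescheduled. That is presumably why the tests in this commit move their forced timestamps from 5 and 10 minutes in the past to 30 seconds.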

Fixes the 'cron jobs run on every gateway restart' issue.
teknium1 2026-03-16 23:48:13 -07:00
parent e3f9894caf
commit 4768ea624d
3 changed files with 64 additions and 7 deletions

@@ -241,7 +241,7 @@ class TestCronTimezone:
 job = create_job(prompt="Test job", schedule="every 1h")
 jobs = load_jobs()
 # Force a naive (no timezone) past timestamp
-naive_past = (datetime.now() - timedelta(minutes=5)).isoformat()
+naive_past = (datetime.now() - timedelta(seconds=30)).isoformat()
 jobs[0]["next_run_at"] = naive_past
 save_jobs(jobs)
@@ -318,7 +318,7 @@ class TestCronTimezone:
 # Simulate a naive timestamp that was written by datetime.now() on a
 # system running in UTC+5:30 — 5 minutes in the past (local time)
-naive_past = (datetime.now() - timedelta(minutes=5)).isoformat()
+naive_past = (datetime.now() - timedelta(seconds=30)).isoformat()
 jobs[0]["next_run_at"] = naive_past
 save_jobs(jobs)
@@ -347,7 +347,7 @@ class TestCronTimezone:
 jobs = load_jobs()
 # Force a naive past timestamp (system-local wall time, 10 min ago)
-naive_past = (datetime.now() - timedelta(minutes=10)).isoformat()
+naive_past = (datetime.now() - timedelta(seconds=30)).isoformat()
 jobs[0]["next_run_at"] = naive_past
 save_jobs(jobs)