hermes-agent/tests/cron
liuhao1024 6777a6bd67 fix(cron): run missed-grace jobs once instead of deferring forever
When a recurring job's execution time exceeds `interval + grace`, the
scheduler entered a perpetual "missed → fast-forward → skip" loop and the
job effectively never ran again. A real job (`hermes-upstream-contribution`)
logged 42 consecutive "missed" events over 9 hours without executing once.

Timeline (5-min interval, 150s grace, ~15-min execution):
  14:00 due → advance next_run_at→14:05 → run (blocks 15 min)
  14:15 finishes
  14:16 tick: next_run_at=14:05, elapsed 660s > grace 150s → "missed!"
        → fast-forward to 14:21 → continue (SKIP) → does NOT run
  ... repeats forever for any job whose runtime > interval+grace.

The `continue` (skip execution) in `_get_due_jobs_locked` was designed to
prevent burst-catchup after *gateway downtime* — don't run 6 missed
instances of a 30-min job on restart. But it wrongly applied to a job that
missed its slot because it was *still running*, not because the gateway was
down.

Fix: keep the fast-forward (so accumulated missed slots are still collapsed
to a single next slot — no burst) but fall through to `due.append(job)` so
the job runs ONCE now. The log message is updated to be honest about the new
behavior ("Running now; next run fast-forwarded to: ...").

Behavior note: a recurring job missed during gateway downtime now also fires
once immediately on restart (rather than waiting for its next natural slot).
This is the intended trade-off — the same "run once, don't burst" rule now
applies uniformly to both downtime-misses and long-execution-misses.

Salvaged from #33318 by @liuhao1024 (authorship preserved). Also addresses
the diagnosis in #33361 (@agent-trivi), which proposed the same one-line fix.

Tests: updates `test_stale_past_due_skipped` →
`test_stale_past_due_runs_once_and_fast_forwards` (the old test encoded the
skip behavior); adds `test_long_execution_does_not_perpetually_defer` as a
direct regression for the production loop; updates the F2e timezone test that
relied on the old skip path. Full tests/cron/ suite: 510 passed.

Fixes #33315

Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>
2026-06-21 14:11:12 +05:30
..
__init__.py test: add unit tests for 8 modules (batch 2) 2026-02-26 13:54:20 +03:00
conftest.py fix(cron): resolve model.default + fail fast on missing model 2026-06-21 12:37:56 +05:30
test_blueprint_catalog.py docs: finish Automation Blueprints terminology rebrand (#44470) 2026-06-11 17:22:22 -04:00
test_claim_job_for_fire.py feat(cron): store-level CAS claim for multi-machine at-most-once fire 2026-06-18 14:34:34 +10:00
test_codex_execution_paths.py refactor(session-log): delete _save_session_log and all callers 2026-05-20 11:44:10 -07:00
test_compute_next_run_last_run_at.py fix(cron): use last_run_at as croniter base for cron jobs 2026-04-29 08:24:48 -07:00
test_cron_context_from.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_cron_inactivity_timeout.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_cron_no_agent.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_cron_prompt_injection_skill.py fix(cron): don't strict-scan script-injected output in no-skills jobs (#43223) 2026-06-10 08:27:24 +05:30
test_cron_script.py test(cron): make env-sanitize probe var deterministic 2026-06-20 00:22:55 +05:30
test_cron_workdir.py fix(cron): make sequential jobs non-blocking too + sweep MCP after jobs finish 2026-06-04 05:40:13 -07:00
test_cronjob_schema.py test(cron): guard schedule-required description text on CRONJOB_SCHEMA 2026-05-26 14:09:37 -07:00
test_file_permissions.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_jobs.py fix(cron): run missed-grace jobs once instead of deferring forever 2026-06-21 14:11:12 +05:30
test_jobs_changed_notify.py feat(cron): wire on_jobs_changed, cron.chronos config, docs + agent↔NAS contract 2026-06-18 15:11:32 +10:00
test_jobs_crossprocess_lock.py fix: complete cron jobs lock salvage 2026-06-15 06:29:00 -07:00
test_parallel_pool.py revert(cron): remove per-job profile support (PR #28124) (#43956) 2026-06-10 20:46:17 -07:00
test_rewrite_skill_refs.py fix(curator): rewrite cron job skill refs after consolidation (#18253) 2026-04-30 23:04:50 -07:00
test_run_one_job.py refactor(cron): extract run_one_job shared firing helper from tick 2026-06-18 14:26:29 +10:00
test_scheduler.py fix(cron): route Telegram DM-topic cron delivery through DeliveryRouter (#22773) 2026-06-21 13:35:45 +05:30
test_scheduler_mcp_init.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_scheduler_provider.py fix(cron): keep ticker alive on BaseException + heartbeat-aware status 2026-06-21 13:00:50 +05:30
test_suggestions.py test(cron): document consent-first self-learning suggestions 2026-06-20 23:23:47 -07:00