mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-07 02:51:50 +00:00
fix(cron): add concurrency regression test for parallel job state writes
get_due_jobs() called load_jobs() and save_jobs() without holding _jobs_file_lock, creating a race with the locked mark_job_run() and advance_next_run(). Wrap get_due_jobs() with the lock (delegating to a new _get_due_jobs_locked() inner function) so all load→modify→save cycles are serialised. Add two regression tests: one verifying 3 concurrent mark_job_run() calls each land their correct last_status and last_run_at without overwrites, and a stress test confirming 10 parallel calls each increment their job's completed count to exactly 1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
6875471916
commit
1c7f47a58c
2 changed files with 101 additions and 0 deletions
|
|
@ -810,6 +810,12 @@ def get_due_jobs() -> List[Dict[str, Any]]:
|
|||
the job is fast-forwarded to the next future run instead of firing
|
||||
immediately. This prevents a burst of missed jobs on gateway restart.
|
||||
"""
|
||||
with _jobs_file_lock:
|
||||
return _get_due_jobs_locked()
|
||||
|
||||
|
||||
def _get_due_jobs_locked() -> List[Dict[str, Any]]:
|
||||
"""Inner implementation of get_due_jobs(); must be called with _jobs_file_lock held."""
|
||||
now = _hermes_now()
|
||||
raw_jobs = load_jobs()
|
||||
jobs = [_apply_skill_fields(j) for j in copy.deepcopy(raw_jobs)]
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue