hermes-agent/cron
CiarasClaws e5b4cf7bea fix(cron): make jobs.json writes safe across processes
`hermes cron pause`/`resume`/`remove` run in their own CLI process (CLI →
cronjob tool → pause_job → update_job → save_jobs), entirely separate from
the gateway process that also writes jobs.json (mark_job_run, advance_next_run,
due-fast-forward in get_due_jobs). The only synchronization was a module-level
`threading.Lock`, which serializes writers *within a single process* but does
nothing across processes — and update_job/pause_job/remove_job/create_job did
not even take it.

The result is a classic lost update: a `cron pause` issued while the gateway is
live loads jobs.json, sets enabled=False, and saves; concurrently the gateway
loads the same file and saves back its run-bookkeeping, clobbering the pause.
The CLI prints "Paused" (it succeeded against its own in-memory copy) but the
job stays enabled and keeps firing, with no error surfaced. The scheduler's
`.tick.lock` flock can't be reused for this — it is held for the entire tick,
including multi-minute agent runs, so a CLI mutation would block for minutes.

Add `_jobs_lock()`: a short-held cross-process advisory file lock (fcntl/msvcrt
flock on `<hermes_home>/cron/.jobs.lock`) layered over the existing in-process
lock, and wrap every load→modify→save critical section with it — create_job,
update_job, remove_job, mark_job_run, advance_next_run, get_due_jobs,
rewrite_skill_refs. The lock degrades to in-process-only if neither fcntl nor
msvcrt is available, preserving prior behaviour. All critical sections are short
(field edits, no agent execution), so contention resolves in milliseconds.

Adds a regression test that proves the lock excludes a second process (an
in-process threading.Lock cannot).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 06:29:00 -07:00
..
scripts fix(cron-recipes): pre-release hardening — honest cadences, strict slot names, surface-aware UX 2026-06-11 10:49:47 -07:00
__init__.py docs: clarify gateway service scopes (#1378) 2026-03-14 21:17:41 -07:00
blueprint_catalog.py docs: finish Automation Blueprints terminology rebrand (#44470) 2026-06-11 17:22:22 -04:00
jobs.py fix(cron): make jobs.json writes safe across processes 2026-06-15 06:29:00 -07:00
scheduler.py Merge remote-tracking branch 'origin/main' into hermes/hermes-6b48295e 2026-06-11 07:38:25 -07:00
suggestion_catalog.py fix(cron-recipes): pre-release hardening — honest cadences, strict slot names, surface-aware UX 2026-06-11 10:49:47 -07:00
suggestions.py refactor(cron): rebrand Cron Recipes -> Automation Blueprints 2026-06-11 10:49:47 -07:00