hermes-agent/website/docs
kshitijk4poor a4e61ddf04 fix(cron): fail closed when an unpinned job's provider drifts from creation snapshot (#44585)
An unpinned cron job follows the global default provider (config.yaml
model.default + resolve_runtime_provider). If that global state is changed
after the job is created — e.g. a temporary switch to a paid provider like
nous/claude-fable-5 — the job silently inherits it on its next tick and spends
real money. This is the reported $7.73 incident: a job created under a
free/default provider later inherited a temporary paid switch.

Fix (ask #1 only) preserves the legitimate "unpinned job should follow
model.default" use case by detecting *drift* rather than freezing the model:

- create_job (cron/jobs.py): for UNPINNED, agent-backed jobs (no explicit
  provider, not no_agent), snapshot the provider that resolution WOULD pick
  right now into a new optional `provider_snapshot` field, resolved via the
  same resolve_runtime_provider() path the ticker uses. Fail-open to None on
  any resolution error so job creation never breaks.

- run_job (cron/scheduler.py): right after runtime resolution, if the job has
  a provider_snapshot AND is unpinned AND the currently-resolved provider
  DIFFERS from the snapshot, fail closed for that run — make no paid call and
  deliver a loud, actionable alert naming both providers and telling the user
  to pin explicitly (`cronjob action=update job_id=.. provider=..`).

Back-compat: jobs with no snapshot (pre-existing jobs, no_agent jobs, or any
job whose creation-time resolution failed) behave exactly as before — the
guard only engages when a snapshot exists. Explicitly-pinned jobs (job.provider
set) are unaffected since they don't drift with global state.

Tests: tests/cron/test_cron_provider_pin.py covers snapshot-matches (runs),
snapshot-differs (fail closed, no agent constructed), no-snapshot back-compat,
None-snapshot back-compat, explicitly-pinned (runs regardless), plus create_job
snapshot capture/skip/fail-open. The fail-closed case is load-bearing (fails
without the guard).

Issue #44585 asks #2-4 (hard-stop a running job, gateway-stop containment,
fail-closed on provider mutation) are out of scope for this change.
2026-06-23 02:45:52 +05:30
..
developer-guide docs: repoint remaining stale gateway/platforms adapter refs to plugins/platforms 2026-06-21 19:59:50 -07:00
getting-started feat(providers): remove google-gemini-cli + google-antigravity OAuth providers (#50492) 2026-06-21 19:53:27 -07:00
guides feat(providers): remove google-gemini-cli + google-antigravity OAuth providers (#50492) 2026-06-21 19:53:27 -07:00
integrations feat(providers): remove google-gemini-cli + google-antigravity OAuth providers (#50492) 2026-06-21 19:53:27 -07:00
reference fix(security): restrict dashboard plugin backend import to bundled plugins (#43719) 2026-06-22 17:51:37 +05:30
user-guide fix(cron): fail closed when an unpinned job's provider drifts from creation snapshot (#44585) 2026-06-23 02:45:52 +05:30
index.mdx docs: point desktop download links to site root (deprecate /desktop) (#46795) 2026-06-15 15:02:24 -04:00
user-stories.mdx docs(website): add User Stories and Use Cases collage page (#18282) 2026-04-30 23:56:59 -07:00