hermes-agent/website/docs/developer-guide/cron-internals.md
Ben Barclay e1f4098b9f
docs(cron): document explicit per-channel delivery targets for all platforms (#54630)
The cron delivery table only showed Discord/Telegram with explicit
target syntax and described Slack and every other platform as
home-channel-only. In fact the generic platform:<target> routing in
_resolve_single_delivery_target resolves explicit targets for every
platform: Slack (#channel / channel ID / channel:thread_ts), Matrix
(room/user IDs), Feishu (chat:thread), WhatsApp (JID / E.164), Signal
(group / E.164), SMS, Email, and Weixin all have dedicated explicit-
target branches in _parse_target_ref; the remaining platforms accept a
generic platform:<chat_id> passthrough.

Update the Delivery Model table (en + zh-Hans) to show the real
per-platform syntax, document #channel name resolution via the channel
directory, and note the Slack thread_ts nuance. Docs-only.
2026-06-29 15:23:16 +10:00

14 KiB

sidebar_position title description
11 Cron Internals How Hermes stores, schedules, edits, pauses, skill-loads, and delivers cron jobs

Cron Internals

The cron subsystem provides scheduled task execution — from simple one-shot delays to recurring cron-expression jobs with skill injection and cross-platform delivery.

Key Files

File Purpose
cron/jobs.py Job model, storage, atomic read/write to jobs.json
cron/scheduler.py Scheduler loop — due-job detection, execution, repeat tracking
tools/cronjob_tools.py Model-facing cronjob tool registration and handler
gateway/run.py Gateway integration — cron ticking in the long-running loop
hermes_cli/cron.py CLI hermes cron subcommands

Scheduling Model

Four schedule formats are supported:

Format Example Behavior
Relative delay 30m, 2h, 1d One-shot, fires after the specified duration
Interval every 2h, every 30m Recurring, fires at regular intervals
Cron expression 0 9 * * * Standard 5-field cron syntax (minute, hour, day, month, weekday)
ISO timestamp 2025-01-15T09:00:00 One-shot, fires at the exact time

The model-facing surface is a single cronjob tool with action-style operations: create, list, update, pause, resume, run, remove.

Job Storage

Jobs are stored in ~/.hermes/cron/jobs.json with atomic write semantics (write to temp file, then rename). Each job record contains:

{
  "id": "a1b2c3d4e5f6",
  "name": "Daily briefing",
  "prompt": "Summarize today's AI news and funding rounds",
  "schedule": {
    "kind": "cron",
    "expr": "0 9 * * *",
    "display": "0 9 * * *"
  },
  "skills": ["ai-funding-daily-report"],
  "deliver": "telegram:-1001234567890",
  "repeat": {
    "times": null,
    "completed": 42
  },
  "state": "scheduled",
  "enabled": true,
  "next_run_at": "2025-01-16T09:00:00Z",
  "last_run_at": "2025-01-15T09:00:00Z",
  "last_status": "ok",
  "created_at": "2025-01-01T00:00:00Z",
  "model": null,
  "provider": null,
  "script": null
}

Job Lifecycle States

State Meaning
scheduled Active, will fire at next scheduled time
paused Suspended — won't fire until resumed
completed Repeat count exhausted or one-shot that has fired
running Currently executing (transient state)

Backward Compatibility

Older jobs may have a single skill field instead of the skills array. The scheduler normalizes this at load time — single skill is promoted to skills: [skill].

Scheduler Runtime

Tick Cycle

The scheduler runs on a periodic tick (default: every 60 seconds):

tick()
  1. Acquire scheduler lock (prevents overlapping ticks)
  2. Load all jobs from jobs.json
  3. Filter to due jobs (next_run <= now AND state == "scheduled")
  4. For each due job:
     a. Set state to "running"
     b. Create fresh AIAgent session (no conversation history)
     c. Load attached skills in order (injected as user messages)
     d. Run the job prompt through the agent
     e. Deliver the response to the configured target
     f. Update run_count, compute next_run
     g. If repeat count exhausted → state = "completed"
     h. Otherwise → state = "scheduled"
  5. Write updated jobs back to jobs.json
  6. Release scheduler lock

Gateway Integration

In gateway mode, the cron trigger (the part that decides when a due job fires — "Axis B") is selected through a pluggable CronScheduler provider. The gateway calls resolve_cron_scheduler() (cron/scheduler_provider.py) and runs the resolved provider's start() in a dedicated background thread, alongside a separate gateway-housekeeping thread.

The active provider is chosen by the cron.provider config key:

  • empty (default) → the built-in InProcessCronScheduler, which runs the historical in-process loop calling scheduler.tick() every 60 seconds. This is byte-identical to the pre-provider behavior.
  • a named provider (e.g. chronos, a managed-cron provider for scale-to-zero deployments) → discovered from plugins/cron/<name>/ or $HERMES_HOME/plugins/<name>/.

If a named provider is missing, fails to load, or reports is_available() == False, the resolver falls back to the built-in with a warning — cron is never left without a trigger. The built-in provider lives in core (cron/scheduler_provider.py), not in plugins/, so the fallback can't be accidentally removed.

What "firing" means (job execution + delivery) is unchanged and shared by all providers — it stays in scheduler.run_job() / scheduler._deliver_result(). A provider only controls the trigger, never execution.

In CLI mode, cron jobs only fire when hermes cron commands are run or during active CLI sessions.

Managed cron (Chronos) for scale-to-zero

Hosted gateways can run the Chronos provider (cron.provider: chronos) instead of the built-in ticker. Chronos lets an idle gateway scale to zero and still fire cron jobs: rather than a 60-second in-process loop (which would keep the process awake), it asks Nous infrastructure to arm exactly one managed one-shot per job at that job's real next-fire time. At fire time Nous calls the gateway back over an authenticated webhook (POST /api/cron/fire); the gateway runs the job through the same run_one_job path as the built-in, then re-arms the next one-shot. Between fires the process can be fully stopped — it wakes only on a genuine fire, never on a periodic timer.

The flow (the managed scheduler is provided by Nous; the agent holds no scheduler credentials):

create/update a cron job
  → Chronos asks Nous to arm a one-shot at the job's next_run_at
      (authenticated with the agent's existing Nous token)
  → at fire time Nous calls the gateway: POST {callback_url}/api/cron/fire
      (authenticated with a short-lived, purpose-scoped Nous-minted JWT)
  → the gateway verifies the token, claims the job (store compare-and-set so
    multi-replica deployments fire at-most-once), runs it, and re-arms the next
    one-shot

Config (all non-secret; on hosted agents Nous sets these at provision time):

key meaning
cron.provider chronos to activate (empty = built-in ticker)
cron.chronos.portal_url Nous base URL (arming + the fire-token issuer)
cron.chronos.callback_url the gateway's own public base URL for inbound fires
cron.chronos.expected_audience this agent's fire-token audience
cron.chronos.nas_jwks_url key set for verifying the inbound fire token

If Chronos is misconfigured or the agent isn't logged into Nous, resolve_cron_scheduler() falls back to the built-in ticker (logged warning) — cron never loses its trigger. Recurring jobs re-arm after each fire; repeat-N jobs stop cleanly when the count is exhausted (no orphaned one-shot). The full agent↔Nous wire contract lives in docs/chronos-managed-cron-contract.md.

Fresh Session Isolation

Each cron job runs in a completely fresh agent session:

  • No conversation history from previous runs
  • No memory of previous cron executions (unless persisted to memory/files)
  • The prompt must be self-contained — cron jobs cannot ask clarifying questions
  • The cronjob toolset is disabled (recursion guard)

Skill-Backed Jobs

A cron job can attach one or more skills via the skills field. At execution time:

  1. Skills are loaded in the specified order
  2. Each skill's SKILL.md content is injected as context
  3. The job's prompt is appended as the task instruction
  4. The agent processes the combined skill context + prompt

This enables reusable, tested workflows without pasting full instructions into cron prompts. For example:

Create a daily funding report → attach "ai-funding-daily-report" skill

Script-Backed Jobs

Jobs can also attach a Python script via the script field. The script runs before each agent turn, and its stdout is injected into the prompt as context. This enables data collection and change detection patterns:

# ~/.hermes/scripts/check_competitors.py
import requests, json
# Fetch competitor release notes, diff against last run
# Print summary to stdout — agent analyzes and reports

The script timeout defaults to 120 seconds. _get_script_timeout() resolves the limit through a three-layer chain:

  1. Module-level override_SCRIPT_TIMEOUT (for tests/monkeypatching). Only used when it differs from the default.
  2. Environment variableHERMES_CRON_SCRIPT_TIMEOUT
  3. Configcron.script_timeout_seconds in config.yaml (read via load_config())
  4. Default — 120 seconds

Provider Recovery

run_job() passes the user's configured fallback providers and credential pool into the AIAgent instance:

  • Fallback providers — reads fallback_providers (list) or fallback_model (legacy dict) from config.yaml, matching the gateway's _load_fallback_model() pattern. Passed as fallback_model= to AIAgent.__init__, which normalizes both formats into a fallback chain.
  • Credential pool — loads via load_pool(provider) from agent.credential_pool using the resolved runtime provider name. Only passed when the pool has credentials (pool.has_credentials()). Enables same-provider key rotation on 429/rate-limit errors.

This mirrors the gateway's behavior — without it, cron agents would fail on rate limits without attempting recovery.

Delivery Model

Cron job results can be delivered to any supported platform.

A bare platform name (slack, telegram, …) delivers to that platform's configured home channel. To target a specific destination instead, append a target after a colon: platform:<target>. The target is resolved at fire time (not when the job is created), so a job can name a destination on a platform that isn't connected yet and start delivering once it comes online.

Most platforms also accept an optional thread/topic as a third segment: platform:<chat_id>:<thread_id>.

Target Syntax Example
Origin chat origin Deliver to the chat where the job was created
Local file local Save to ~/.hermes/cron/output/
Telegram telegram, telegram:<chat_id>, telegram:<chat_id>:<thread_id>, telegram:@username telegram:-1001234567890:17585
Discord discord, discord:#channel, discord:<channel_id>, discord:<channel_id>:<thread_id> discord:#engineering
Slack slack, slack:#channel, slack:<channel_id>, slack:<channel_id>:<thread_ts> slack:#engineering
Matrix matrix, matrix:<!room_id:server>, matrix:<@user:server> matrix:!abc123:example.org
Feishu feishu, feishu:<chat_id>, feishu:<chat_id>:<thread_id> feishu:oc_abc123def
WhatsApp whatsapp, whatsapp:<jid>, whatsapp:+<E.164> whatsapp:123456@g.us
Signal signal, signal:group:<id>, signal:+<E.164> signal:group:aBcD==
SMS sms, sms:+<E.164> sms:+<E.164 number>
Email email, email:<address> email:alerts@example.com
Weixin weixin, weixin:<wxid> weixin:wxid_abc123
Mattermost mattermost or mattermost:<channel_id> Bare name delivers to Mattermost home
Home Assistant homeassistant or homeassistant:<conversation> Bare name delivers to HA conversation
DingTalk dingtalk or dingtalk:<chat_id> Bare name delivers to DingTalk
WeCom wecom or wecom:<chat_id> Bare name delivers to WeCom
BlueBubbles bluebubbles or bluebubbles:<chat_guid> Bare name delivers to iMessage via BlueBubbles
QQ Bot qqbot or qqbot:<chat_id> Bare name delivers to QQ (Tencent) via Official API v2

Platforms in the first group have explicit, validated target syntax — named channels (#channel), topics/threads, room/user IDs, group IDs, or phone numbers. The remaining platforms accept the generic platform:<chat_id> form (the value after the colon is used verbatim as the destination ID); a bare platform name always delivers to the home channel.

Named channels (slack:#engineering, discord:#engineering, or a friendly name like slack:engineering) are resolved against the channel directory the gateway builds from connected adapters, so the gateway must have discovered the channel for name resolution to succeed; raw IDs (slack:C0123ABCD45) always work.

For Telegram topics, use telegram:<chat_id>:<thread_id> (e.g., telegram:-1001234567890:17585). For Slack threads, the third segment is the parent message's thread_ts (e.g., slack:C0123ABCD45:1700000000.000100), so it only applies when replying under an existing message.

Response Wrapping

By default (cron.wrap_response: true), cron deliveries are wrapped with:

  • A header identifying the cron job name and task
  • A footer noting the agent cannot see the delivered message in conversation

The [SILENT] prefix in a cron response suppresses delivery entirely — useful for jobs that only need to write to files or perform side effects.

Session Isolation

Cron deliveries are NOT mirrored into gateway session conversation history. They exist only in the cron job's own session. This prevents message alternation violations in the target chat's conversation.

Recursion Guard

Cron-run sessions have the cronjob toolset disabled. This prevents:

  • A scheduled job from creating new cron jobs
  • Recursive scheduling that could explode token usage
  • Accidental mutation of the job schedule from within a job

Locking

The scheduler uses cross-process file-based locking (fcntl.flock on Unix, msvcrt.locking on Windows) to prevent overlapping ticks from executing the same due-job batch twice — even between the gateway's in-process ticker and a standalone hermes cron / manual tick() call. If the lock cannot be acquired, tick() returns 0 immediately.

CLI Interface

The hermes cron CLI provides direct job management:

hermes cron list                    # Show all jobs
hermes cron create                  # Interactive job creation (alias: add)
hermes cron edit <job_id>           # Edit job configuration
hermes cron pause <job_id>          # Pause a running job
hermes cron resume <job_id>         # Resume a paused job
hermes cron run <job_id>            # Trigger immediate execution
hermes cron remove <job_id>         # Delete a job