docs: add cron script timeout and provider recovery documentation

- Add HERMES_CRON_TIMEOUT and HERMES_CRON_SCRIPT_TIMEOUT to env vars reference - Add script timeout and provider recovery sections to cron features page - Add timeout resolution chain and credential pool details to cron internals
2026-04-26 01:01:40 +00:00 · 2026-04-10 02:53:49 -07:00 · 2026-04-10 02:53:49 -07:00 · 95ee453bc0
commit 95ee453bc0
parent 38cce22e2c
3 changed files with 44 additions and 0 deletions
--- a/website/docs/developer-guide/cron-internals.md
+++ b/website/docs/developer-guide/cron-internals.md
@ -132,6 +132,22 @@ import requests, json
 # Print summary to stdout — agent analyzes and reports
 ```

+The script timeout defaults to 120 seconds. `_get_script_timeout()` resolves the limit through a three-layer chain:
+
+1. **Module-level override** — `_SCRIPT_TIMEOUT` (for tests/monkeypatching). Only used when it differs from the default.
+2. **Environment variable** — `HERMES_CRON_SCRIPT_TIMEOUT`
+3. **Config** — `cron.script_timeout_seconds` in `config.yaml` (read via `load_config()`)
+4. **Default** — 120 seconds
+
+### Provider Recovery
+
+`run_job()` passes the user's configured fallback providers and credential pool into the `AIAgent` instance:
+
+- **Fallback providers** — reads `fallback_providers` (list) or `fallback_model` (legacy dict) from `config.yaml`, matching the gateway's `_load_fallback_model()` pattern. Passed as `fallback_model=` to `AIAgent.__init__`, which normalizes both formats into a fallback chain.
+- **Credential pool** — loads via `load_pool(provider)` from `agent.credential_pool` using the resolved runtime provider name. Only passed when the pool has credentials (`pool.has_credentials()`). Enables same-provider key rotation on 429/rate-limit errors.
+
+This mirrors the gateway's behavior — without it, cron agents would fail on rate limits without attempting recovery.
+
 ## Delivery Model

 Cron job results can be delivered to any supported platform:
--- a/website/docs/reference/environment-variables.md
+++ b/website/docs/reference/environment-variables.md
@ -285,6 +285,13 @@ For cloud sandbox backends, persistence is filesystem-oriented. `TERMINAL_LIFETI
 | `HERMES_BACKGROUND_NOTIFICATIONS` | Background process notification mode in gateway: `all` (default), `result`, `error`, `off` |
 | `HERMES_EPHEMERAL_SYSTEM_PROMPT` | Ephemeral system prompt injected at API-call time (never persisted to sessions) |

+## Cron Scheduler
+
+| Variable | Description |
+|----------|-------------|
+| `HERMES_CRON_TIMEOUT` | Inactivity timeout for cron job agent runs in seconds (default: `600`). The agent can run indefinitely while actively calling tools or receiving stream tokens — this only triggers when idle. Set to `0` for unlimited. |
+| `HERMES_CRON_SCRIPT_TIMEOUT` | Timeout for pre-run scripts attached to cron jobs in seconds (default: `120`). Override for scripts that need longer execution (e.g., randomized delays for anti-bot timing). Also configurable via `cron.script_timeout_seconds` in `config.yaml`. |
+
 ## Session Settings

 | Variable | Description |
--- a/website/docs/user-guide/features/cron.md
+++ b/website/docs/user-guide/features/cron.md
@ -240,6 +240,27 @@ Otherwise, report the issue.

 Failed jobs always deliver regardless of the `[SILENT]` marker — only successful runs can be silenced.

+## Script timeout
+
+Pre-run scripts (attached via the `script` parameter) have a default timeout of 120 seconds. If your scripts need longer — for example, to include randomized delays that avoid bot-like timing patterns — you can increase this:
+
+```yaml
+# ~/.hermes/config.yaml
+cron:
+  script_timeout_seconds: 300   # 5 minutes
+```
+
+Or set the `HERMES_CRON_SCRIPT_TIMEOUT` environment variable. The resolution order is: env var → config.yaml → 120s default.
+
+## Provider recovery
+
+Cron jobs inherit your configured fallback providers and credential pool rotation. If the primary API key is rate-limited or the provider returns an error, the cron agent can:
+
+- **Fall back to an alternate provider** if you have `fallback_providers` (or the legacy `fallback_model`) configured in `config.yaml`
+- **Rotate to the next credential** in your [credential pool](/docs/user-guide/configuration#credential-pool-strategies) for the same provider
+
+This means cron jobs that run at high frequency or during peak hours are more resilient — a single rate-limited key won't fail the entire run.
+
 ## Schedule formats

 The agent's final response is automatically delivered — you do **not** need to include `send_message` in the cron prompt for that same destination. If a cron run calls `send_message` to the exact target the scheduler will already deliver to, Hermes skips that duplicate send and tells the model to put the user-facing content in the final response instead. Use `send_message` only for additional or different targets.