hermes-agent/website/docs/guides/cron-troubleshooting.md
Teknium fef1a41248
docs: round 2 audit — messaging, developer-guide, guides, integrations (#22858)
Cross-checked 75 docs pages under user-guide/messaging/, developer-guide/,
guides/, and integrations/ against the live registries and gateway code.

messaging/
- index.md: API Server toolset is hermes-api-server (was 'hermes (default)');
  Google Chat slug is hermes-google_chat (underscore — plugin name uses _).
- google_chat.md: drop bogus 'pip install hermes-agent[google_chat]' (no such
  extra); list the actual deps (google-cloud-pubsub, google-api-python-client,
  google-auth, google-auth-oauthlib).
- qqbot.md: config namespace is platforms.qqbot (was platforms.qq, which is
  silently ignored by the adapter); QQ_STT_BASE_URL is not read directly —
  baseUrl lives under platforms.qqbot.extra.stt.
- teams-meetings.md: 'hermes teams-pipeline' is plugin-gated (teams_pipeline
  plugin must be enabled), not a built-in subcommand.
- sms.md: example log line 0.0.0.0:8080 -> 127.0.0.1:8080 (default
  SMS_WEBHOOK_HOST).
- open-webui.md: API_SERVER_* are env vars, not YAML keys — write them to
  per-profile .env, not 'hermes config set' (same pattern fixed in
  api-server.md last round). Also bumped example ports to 8650+ to dodge the
  default webhook (8644)/wecom-callback (8645)/msgraph-webhook (8646)
  collision.

developer-guide/
- architecture.md: tool/toolset counts (61/52 -> 70+/~28); LOC stamps for
  run_agent.py, cli.py, hermes_cli/main.py, setup.py, mcp_tool.py,
  gateway/run.py replaced with 'large file' to stop drifting.
- agent-loop.md: same LOC drift (~13,700 -> 'a large file (15k+ lines)').
- gateway-internals.md: '14+ external messaging platforms' -> '20+'; gateway
  platform tree updated (qqbot is a sub-package, not qqbot.py; added
  yuanbao.py, feishu_comment.py, msgraph_webhook.py); 'gateway/builtin_hooks/
  (always active)' was wrong — it's an empty extension point and
  _register_builtin_hooks() is a no-op stub.
- acp-internals.md: drop fictional 'message_callback' from the bridged-
  callbacks list; clarify thinking_callback is currently set to None.
- provider-runtime.md: provider list was missing AWS Bedrock, Azure Foundry,
  NVIDIA NIM, xAI, Arcee, GMI Cloud, StepFun, Qwen OAuth, Xiaomi, Ollama
  Cloud, LM Studio, Tencent TokenHub. Fallback section described only the
  legacy single-pair model — corrected to the canonical list-form
  fallback_providers chain.
- environments.md: parsers list missing llama4_json and the deepseek_v31
  alias; both register via @register_parser.
- browser-supervisor.md: drop reference to scripts/browser_supervisor_e2e.py
  which doesn't exist in-repo.
- contributing.md: tinker-atropos is a git submodule — note that
  'git submodule update --init' is required if cloning without
  --recurse-submodules.

guides/
- operate-teams-meeting-pipeline.md: cron flags were all wrong — schedule is
  positional (not --schedule), the script-only flag is --no-agent (not
  --script-only), and there's no --command flag. Replaced with a real example
  that creates the script under ~/.hermes/scripts/ and uses the actual flags.
  Also replaced fictional 'hermes cron show <name>' with 'hermes cron status'.
- automation-templates.md: 'cron create --skills "a,b"' doesn't work —
  the flag is --skill (singular, repeatable). Fixed all 5 occurrences via AST
  rewrite.
- minimax-oauth.md: 'hermes auth add minimax-oauth --region cn' silently
  fails because --region isn't registered on the auth-add argparse spec.
  Pointed users at the minimax-cn provider (or MINIMAX_CN_API_KEY env) for
  China-region access.
- cron-script-only.md: 'hermes send' is fictional — replaced the comparison-
  table mention with a webhook-subscription pointer; also fixed the dead link
  to /guides/pipe-script-output (page doesn't exist).
- cron-troubleshooting.md: 'hermes serve' isn't a real subcommand. Pointed
  at 'hermes gateway' (foreground) / 'hermes gateway start' (service).
- local-ollama-setup.md: 'agent.api_timeout' is not a config key. The right
  knob is the HERMES_API_TIMEOUT env var.
- python-library.md: run_conversation() return dict has only final_response
  and messages — task_id is stored on the agent instance, not echoed back.
- use-mcp-with-hermes.md: '--args /c "npx -y …"' wraps the npx command in
  one quoted string, so cmd.exe gets a single arg instead of the multi-token
  command line it needs. Removed the surrounding quotes — argparse nargs='*'
  collects each token correctly.

integrations/
- providers.md: Bedrock guardrail YAML keys were 'id'/'version' (don't exist);
  actual keys are guardrail_identifier/guardrail_version (matches DEFAULT_CONFIG
  and the run_agent.py reader). GMI default base URL (api.gmi.ai/v1 ->
  api.gmi-serving.com/v1) and portal URL (inference.gmi.ai -> www.gmicloud.ai)
  refreshed. Fallback section rewritten to lead with the canonical
  fallback_providers list form (was leading with the legacy fallback_model
  single dict); supported-providers list extended to include azure-foundry,
  alibaba-coding-plan, lmstudio.

index.md
- '68 built-in tools' -> '70+'; '15+ platforms' was both inconsistent with
  integrations/index.md ('19+') and undercounted — bumped to 20+ and added
  Weixin/QQ Bot/Yuanbao/Google Chat to the list.

Validation: 'npm run build' clean (exit 0); broken-link count unchanged at
155 (same as round-1 post-skill-regen baseline). 24 files, +132/-89.
2026-05-09 15:00:24 -07:00

9 KiB

| sidebar_position | title | description |
| --- | --- | --- |
| 12 | Cron Troubleshooting | Diagnose and fix common Hermes cron issues: jobs not firing, delivery failures, skill loading errors, and performance problems |

Cron Troubleshooting

When a cron job isn't behaving as expected, work through these checks in order. Most issues fall into one of four categories: timing, delivery, permissions, or skill loading.


Jobs Not Firing

Check 1: Verify the job exists and is active

hermes cron list

Look for the job and confirm its state is [active] (not [paused] or [completed]). If it shows [completed], the repeat count may be exhausted — edit the job to reset it.

Check 2: Confirm the schedule is correct

A misformatted schedule silently defaults to one-shot or is rejected entirely. Test your expression:

| Your expression | Should evaluate to |
| --- | --- |
| 0 9 * * * | 9:00 AM every day |
| 0 9 * * 1 | 9:00 AM every Monday |
| every 2h | Every 2 hours from now |
| 30m | 30 minutes from now |
| 2025-06-01T09:00:00 | June 1, 2025 at 9:00 AM UTC |

If the job fires once and then disappears from the list, it's a one-shot schedule (30m, 1d, or an ISO timestamp) — expected behavior.
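To sanity-check which kind of schedule an expression looks like before creating a job, a rough classifier can help. The sketch below is a hypothetical helper based only on the formats in the table above; it is not Hermes's own parser, and it does not validate the contents of cron fields.

```python
import re
from datetime import datetime

# Hypothetical helper for illustration; this is not Hermes's own parser.
_DURATION = re.compile(r"^\d+[mhd]$")          # e.g. 30m, 2h, 1d
_INTERVAL = re.compile(r"^every\s+\d+[mhd]$")  # e.g. every 2h


def classify_schedule(expr: str) -> str:
    """Return 'cron', 'interval', 'one-shot', or 'unrecognized'."""
    text = expr.strip()
    if _INTERVAL.match(text):
        return "interval"      # recurring
    if _DURATION.match(text):
        return "one-shot"      # relative delay, fires once
    if len(text.split()) == 5:
        return "cron"          # five space-separated fields (contents not checked)
    try:
        datetime.fromisoformat(text)
        return "one-shot"      # absolute timestamp, fires once
    except ValueError:
        return "unrecognized"


for example in ["0 9 * * *", "every 2h", "30m", "2025-06-01T09:00:00", "9am daily"]:
    print(f"{example!r:>24} -> {classify_schedule(example)}")
```

An expression that comes back as one-shot when you intended a recurring job is the usual cause of a job that fires once and then disappears.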

Check 3: Is the gateway running?

Cron jobs are fired by the gateway's background ticker thread, which ticks every 60 seconds. A regular CLI chat session does not automatically fire cron jobs.

If you're expecting jobs to fire automatically, you need a running gateway (hermes gateway for foreground, or hermes gateway start for the installed service). For one-off debugging, you can manually trigger a tick with hermes cron tick.
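The reason a running gateway matters is easier to see with a generic ticker in hand. The following is a minimal sketch of a background ticker thread in general, not the gateway's actual code; `start_ticker` and its parameters are hypothetical names, and the interval is shortened so the demo finishes quickly.

```python
import threading
import time


def start_ticker(on_tick, interval_s: float = 60.0):
    """Call on_tick() every interval_s seconds on a daemon thread.

    Returns a threading.Event; set it to stop the ticker.
    """
    stop = threading.Event()

    def loop():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval_s):
            on_tick()

    threading.Thread(target=loop, daemon=True).start()
    return stop


# Demo with a short interval instead of 60 seconds.
ticks = []
stop = start_ticker(lambda: ticks.append(time.monotonic()), interval_s=0.05)
time.sleep(0.3)
stop.set()
print(f"ticks observed: {len(ticks)}")
```

If no process hosts such a thread, nothing checks for due jobs, which matches the behavior described above for a plain CLI chat session.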

Check 4: Check the system clock and timezone

Jobs use the local timezone. If your machine's clock is wrong or in a different timezone than expected, jobs will fire at the wrong times. Verify:

date
hermes cron list   # Compare next_run times with local time
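If the plain `date` output is ambiguous, it can help to print the local time, the UTC time, and the offset side by side. This is a small standard-library sketch; `clock_report` is a hypothetical helper and does not read anything from Hermes.

```python
from datetime import datetime, timezone


def clock_report() -> dict:
    """Collect the local time, its UTC offset, and the current UTC time."""
    local_now = datetime.now().astimezone()   # timezone-aware local time
    return {
        "local": local_now.isoformat(timespec="seconds"),
        "utc": local_now.astimezone(timezone.utc).isoformat(timespec="seconds"),
        "tz_name": local_now.tzname(),
        "utc_offset_hours": local_now.utcoffset().total_seconds() / 3600,
    }


for key, value in clock_report().items():
    print(f"{key:>16}: {value}")
```

Compare the `local` value with the next_run times from hermes cron list; a mismatch of whole hours usually points to a timezone setting rather than a wrong clock.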

Delivery Failures

Check 1: Verify the deliver target is correct

Delivery targets are case-sensitive and require the correct platform to be configured. A misconfigured target silently drops the response.

| Target | Requires |
| --- | --- |
| telegram | TELEGRAM_BOT_TOKEN in ~/.hermes/.env |
| discord | DISCORD_BOT_TOKEN in ~/.hermes/.env |
| slack | SLACK_BOT_TOKEN in ~/.hermes/.env |
| whatsapp | WhatsApp gateway configured |
| signal | Signal gateway configured |
| matrix | Matrix homeserver configured |
| email | SMTP configured in config.yaml |
| sms | SMS provider configured |
| local | Write access to ~/.hermes/cron/output/ |
| origin | Delivers to the chat where the job was created |

Other supported platforms include mattermost, homeassistant, dingtalk, feishu, wecom, weixin, bluebubbles, qqbot, and webhook. You can also target a specific chat with platform:chat_id syntax (e.g., telegram:-1001234567890).

If delivery fails, the job still runs; its output just isn't sent anywhere. Check hermes cron list for an updated last_error field (if available).
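A quick way to rule out a missing token is to split the target and look for the matching variable in ~/.hermes/.env. The sketch below covers only the three token-based platforms from the table above; the helper names are hypothetical, the .env parsing is deliberately simplistic (plain NAME=value lines), and none of this is Hermes's own validation logic.

```python
from pathlib import Path
from typing import Optional

# Token variables taken from the table above; other platforms are not covered.
REQUIRED_ENV = {
    "telegram": "TELEGRAM_BOT_TOKEN",
    "discord": "DISCORD_BOT_TOKEN",
    "slack": "SLACK_BOT_TOKEN",
}


def split_target(target: str) -> tuple:
    """Split 'platform:chat_id' into (platform, chat_id); chat_id may be None."""
    platform, sep, chat_id = target.partition(":")
    return platform, (chat_id if sep else None)


def env_keys(env_text: str) -> set:
    """Return the variable names that have a non-empty value in .env text."""
    keys = set()
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if value.strip():
            keys.add(name.strip())
    return keys


def missing_token(target: str, env_text: str) -> Optional[str]:
    """Return the missing variable name for a target, or None if satisfied."""
    platform, _ = split_target(target)
    needed = REQUIRED_ENV.get(platform)
    if needed is None or needed in env_keys(env_text):
        return None
    return needed


env_file = Path.home() / ".hermes" / ".env"
env_text = env_file.read_text() if env_file.exists() else ""
for target in ["telegram:-1001234567890", "slack", "local"]:
    print(target, "->", missing_token(target, env_text) or "no known token missing")
```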

Check 2: Check [SILENT] usage

If your cron job produces no output or the agent responds with [SILENT], delivery is suppressed. This is intentional for monitoring jobs — but make sure your prompt isn't accidentally suppressing everything.

A prompt that says "respond with [SILENT] if nothing changed" will silently swallow non-empty responses too. Check your conditional logic.

Check 3: Platform token permissions

Each messaging platform bot needs specific permissions to post messages to the target chat. If delivery silently fails:

  • Telegram: Bot must be an admin in the target group/channel
  • Discord: Bot must have permission to send in the target channel
  • Slack: Bot must be added to the workspace and have chat:write scope

Check 4: Response wrapping

By default, cron responses are wrapped with a header and footer (cron.wrap_response: true in config.yaml). Some platforms or integrations may not handle this well. To disable:

cron:
  wrap_response: false

Skill Loading Failures

Check 1: Verify skills are installed

hermes skills list

Skills must be installed before they can be attached to cron jobs. If a skill is missing, install it first with hermes skills install <skill-name> or via /skills in the CLI.

Check 2: Check skill name vs. skill folder name

Skill names are case-sensitive and must match the installed skill's folder name. If your job specifies ai-funding-daily-report but the skill folder differs in case or spelling (for example, AI-Funding-Daily-Report), the skill won't load; confirm the exact name from hermes skills list.

Check 3: Skills that require interactive tools

Cron jobs run with the cronjob, messaging, and clarify toolsets disabled. This prevents recursive cron creation, direct message sending (delivery is handled by the scheduler), and interactive prompts. If a skill relies on these toolsets, it won't work in a cron context.

Check the skill's documentation to confirm it works in non-interactive (headless) mode.

Check 4: Multi-skill ordering

When using multiple skills, they load in order. If Skill A depends on context from Skill B, make sure B loads first:

/cron add "0 9 * * *" "..." --skill context-skill --skill target-skill

In this example, context-skill loads before target-skill.


Job Errors and Failures

Check 1: Review recent job output

If a job ran and failed, you may see error context in:

  1. The chat where the job delivers (if delivery succeeded)
  2. ~/.hermes/logs/agent.log for scheduler messages (or errors.log for warnings)
  3. The job's last_run metadata via hermes cron list

Check 2: Common error patterns

"No such file or directory" for scripts

The script path must be an absolute path (or relative to the Hermes config directory). Verify:

ls ~/.hermes/scripts/your-script.py   # Must exist
hermes cron edit <job_id> --script ~/.hermes/scripts/your-script.py

"Skill not found" at job execution

The skill must be installed on the machine running the scheduler. If you move between machines, skills don't automatically sync; reinstall them with hermes skills install <skill-name>.

Job runs but delivers nothing

Likely a delivery target issue (see Delivery Failures above) or a silently suppressed response ([SILENT]).

Job hangs or times out

The scheduler uses an inactivity-based timeout (default 600s, configurable via the HERMES_CRON_TIMEOUT env var; 0 means unlimited). The agent can run as long as it's actively calling tools; the timer only fires after sustained inactivity. Long-running jobs should use scripts to handle data collection and deliver only the result.
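The difference between an inactivity timeout and a wall-clock timeout can be illustrated with a small timer that resets on every sign of activity. This is a sketch of the general technique under the defaults stated above (600 seconds, 0 for unlimited); `InactivityTimer` is a hypothetical class, not the scheduler's implementation, and it uses a simulated clock so it runs instantly.

```python
import time


class InactivityTimer:
    """Expires only after timeout_s seconds without a touch(); 0 disables it."""

    def __init__(self, timeout_s: float = 600.0, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock
        self._last_activity = clock()

    def touch(self) -> None:
        """Record activity (for example, a tool call) and restart the countdown."""
        self._last_activity = self._clock()

    def expired(self) -> bool:
        if self.timeout_s == 0:        # 0 means unlimited
            return False
        return self._clock() - self._last_activity >= self.timeout_s


# Simulated clock so the example does not actually wait.
now = [0.0]
timer = InactivityTimer(timeout_s=600, clock=lambda: now[0])

now[0] = 500
timer.touch()                    # activity at t=500 restarts the countdown
now[0] = 1000
print(timer.expired())           # False: only 500s since the last activity
now[0] = 1100
print(timer.expired())           # True: 600s with no activity
```

Under this model a job that has been running for 1000 seconds is still alive as long as it keeps doing work, while a job stuck waiting on something silent is cut off.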

Check 3: Lock contention

The scheduler uses file-based locking to prevent overlapping ticks. If two gateway instances are running (or a CLI session conflicts with a gateway), jobs may be delayed or skipped.

Kill duplicate gateway processes:

ps aux | grep hermes
# Kill duplicate processes, keep only one
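To see why a second instance ends up skipping or delaying work, here is a minimal sketch of non-blocking file locking using `fcntl.flock` (Unix only). The lock file path is a temporary, hypothetical one and `try_lock` is an illustrative helper; the lock file and mechanism Hermes actually uses may differ.

```python
import fcntl
import os
import tempfile


def try_lock(path: str):
    """Try to take an exclusive, non-blocking lock on path.

    Returns the open file object on success (keep it open to hold the lock),
    or None if another holder already has it.
    """
    handle = open(path, "a+")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


# Hypothetical lock file in a temp directory, not the path Hermes uses.
lock_path = os.path.join(tempfile.mkdtemp(), "tick.lock")

first = try_lock(lock_path)
second = try_lock(lock_path)     # simulates a second scheduler instance
print("first holder got lock:", first is not None)
print("second holder got lock:", second is not None)

first.close()                    # closing the file releases the lock
third = try_lock(lock_path)
print("after release:", third is not None)
third.close()
```

The second attempt fails immediately instead of waiting, which is the pattern that leads a duplicate process to skip a tick while the first one is still working.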

Check 4: Permissions on jobs.json

Jobs are stored in ~/.hermes/cron/jobs.json. If this file is not readable/writable by your user, the scheduler will fail silently:

ls -la ~/.hermes/cron/jobs.json
chmod 600 ~/.hermes/cron/jobs.json   # Owner read/write only; the file should be owned by your user
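The same checks can be scripted. The sketch below tests readability, writability, and ownership of the jobs file, and additionally tries to parse it as JSON; the JSON check is an extra assumption (a corrupted file is another plausible cause of trouble) and `check_jobs_file` is a hypothetical helper.

```python
import json
import os
from pathlib import Path


def check_jobs_file(path: Path) -> list:
    """Return a list of problems found with the jobs file (empty if none)."""
    if not path.exists():
        return [f"{path} does not exist"]
    problems = []
    if not os.access(path, os.R_OK):
        problems.append("not readable by the current user")
    if not os.access(path, os.W_OK):
        problems.append("not writable by the current user")
    # os.getuid is unavailable on Windows, so guard the ownership check.
    if hasattr(os, "getuid") and path.stat().st_uid != os.getuid():
        problems.append("owned by a different user")
    if not problems:
        try:
            json.loads(path.read_text())
        except json.JSONDecodeError as exc:
            problems.append(f"not valid JSON: {exc}")
    return problems


jobs_file = Path.home() / ".hermes" / "cron" / "jobs.json"
print(check_jobs_file(jobs_file) or "no problems found")
```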

Performance Issues

Slow job startup

Each cron job creates a fresh AIAgent session, which may involve provider authentication and model loading. For time-sensitive schedules, add buffer time (e.g., 0 8 * * * instead of 0 9 * * *).

Too many overlapping jobs

The scheduler executes jobs sequentially within each tick. If multiple jobs are due at the same time, they run one after another. Consider staggering schedules (e.g., 0 9 * * * and 5 9 * * * instead of both at 0 9 * * *) to avoid delays.
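When several jobs share a start time, the staggered expressions can be generated rather than written by hand. This is a small illustrative helper (`stagger` is a hypothetical name) that only handles a plain numeric minute field and offsets that stay within the same hour.

```python
def stagger(cron_expr: str, count: int, step_minutes: int = 5) -> list:
    """Return `count` copies of cron_expr with the minute field offset by step.

    Only handles a plain numeric minute field and offsets that stay within
    the same hour; anything else raises ValueError.
    """
    fields = cron_expr.split()
    if len(fields) != 5 or not fields[0].isdigit():
        raise ValueError("expected five fields with a numeric minute")
    base = int(fields[0])
    last = base + step_minutes * (count - 1)
    if last > 59:
        raise ValueError("offsets would spill past the end of the hour")
    return [
        " ".join([str(base + i * step_minutes)] + fields[1:])
        for i in range(count)
    ]


for expr in stagger("0 9 * * *", 3):
    print(expr)
# 0 9 * * *
# 5 9 * * *
# 10 9 * * *
```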

Large script output

Scripts that dump megabytes of output will slow down the agent and may hit token limits. Filter/summarize at the script level — emit only what the agent needs to reason about.
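One way to filter at the script level is to keep only the notable lines and cap how many are emitted. The sketch below assumes the raw data is a log whose lines contain the markers ERROR or WARN; those markers, the limits, and the `summarize` helper are illustrative assumptions, so adapt them to whatever your script actually collects.

```python
def summarize(lines, max_lines: int = 20, max_chars: int = 200) -> str:
    """Keep only lines mentioning ERROR or WARN, capped and truncated."""
    notable = [line.strip() for line in lines if "ERROR" in line or "WARN" in line]
    errors = sum("ERROR" in line for line in notable)
    warnings = len(notable) - errors
    header = f"{len(notable)} notable lines ({errors} errors, {warnings} warnings)"
    shown = [line[:max_chars] for line in notable[:max_lines]]
    omitted = len(notable) - len(shown)
    tail = [f"... {omitted} more omitted"] if omitted > 0 else []
    return "\n".join([header, *shown, *tail])


# Thousands of routine lines collapse into a short summary for the agent.
raw_log = ["INFO start"] * 5000 + ["WARN disk at 91%", "ERROR upload failed: timeout"]
print(summarize(raw_log))
```

The agent then receives a handful of lines plus counts, rather than the full dump.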


Diagnostic Commands

hermes cron list                    # Show all jobs, states, next_run times
hermes cron run <job_id>            # Schedule for next tick (for testing)
hermes cron edit <job_id>           # Fix configuration issues
hermes logs                         # View recent Hermes logs
hermes skills list                  # Verify installed skills

Getting More Help

If you've worked through this guide and the issue persists:

  1. Run the job with hermes cron run <job_id> (fires on next gateway tick) and watch for errors in the chat output
  2. Check ~/.hermes/logs/agent.log for scheduler messages and ~/.hermes/logs/errors.log for warnings
  3. Open an issue at github.com/NousResearch/hermes-agent with:
    • The job ID and schedule
    • The delivery target
    • What you expected vs. what happened
    • Relevant error messages from the logs

For the complete cron reference, see Automate Anything with Cron and Scheduled Tasks (Cron).