What to Do When Cron Jobs Fail Silently — Debugging Guide

You just discovered that your nightly database backup has not run in three weeks. Or your certificate renewal script stopped working after a server update. Or the ETL pipeline that feeds your analytics dashboard has been producing empty files since last Tuesday. Nobody noticed. Welcome to the most common operations problem in the industry: the silent cron failure.

This guide walks through the most frequent causes, how to debug them, and how to prevent them from ever happening again.

The 7 Most Common Causes of Silent Cron Failures

1. Wrong PATH

The cron daemon runs with a minimal environment. On most Linux systems, the cron PATH is /usr/bin:/bin. If your script calls node, python3, docker, aws, or any binary installed outside those directories, the command will fail with "command not found" — and you will never see the error message.

Fix: use absolute paths in your crontab, or set PATH explicitly at the top of the crontab file:

    PATH=/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin
    0 2 * * * /usr/local/bin/python3 /opt/scripts/etl.py
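To find out ahead of time which commands would break, you can look them up under cron's PATH instead of your own. The following is a minimal sketch; the function name missing_under_path is hypothetical, and the PATH you pass in should match whatever your cron daemon actually uses.

```shell
# Print every command that cannot be resolved under the given PATH.
# Usage: missing_under_path "/usr/bin:/bin" python3 docker aws
missing_under_path() {
    search_path=$1
    shift
    for cmd in "$@"; do
        # The lookup runs in a subshell so the caller's PATH is left untouched.
        if ! (PATH=$search_path; command -v "$cmd" >/dev/null 2>&1); then
            printf '%s\n' "$cmd"
        fi
    done
}
```

Any name it prints needs either an absolute path in the crontab entry or an extra directory in the crontab's PATH line.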

2. Missing Environment Variables

Your script works perfectly when you run it manually because your shell session loads .bashrc, .profile, or .env files. Cron does not load any of these. Database URLs, API keys, AWS credentials, and other environment variables are simply absent.

Fix: source your environment file explicitly, or define variables in the crontab:

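    DATABASE_URL=postgres://user:pass@localhost/mydb
    0 2 * * * /opt/scripts/backup.sh

    # Or source a file (set -a exports every variable the file defines,
    # so plain KEY=value lines reach the script as well)
    0 2 * * * set -a; . /opt/scripts/.env; set +a; /opt/scripts/backup.sh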
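A script can also defend itself by refusing to start when a variable it needs is absent, which turns a confusing downstream error into an immediate, readable one. This is a sketch only; require_env is a hypothetical helper, not part of any tool mentioned here.

```shell
# Return 1 with a clear message if any named variable is unset or empty.
# Usage: require_env DATABASE_URL AWS_ACCESS_KEY_ID || exit 1
require_env() {
    for name in "$@"; do
        # Indirect lookup; the names are hard-coded by the script author,
        # never taken from untrusted input.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing required environment variable: $name" >&2
            return 1
        fi
    done
}
```

Calling it at the top of backup.sh makes the job exit non-zero as soon as cron's sparse environment is the problem.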

3. Permission Errors

Cron jobs run as the user who owns the crontab. If you edited the crontab as root but the script expects to run as www-data, or if the script tries to write to a directory owned by another user, it will fail silently. Similarly, scripts that rely on SSH keys or GPG keys will fail if the cron user does not have access to the keyring.
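A short preflight check at the start of the job can surface these problems explicitly. The sketch below assumes the job needs to run as one specific user, execute one script, and write into one directory; the function name preflight and its argument order are illustrative.

```shell
# Verify the current user, the script's execute bit, and write access
# to the output directory. Returns 1 and explains the first problem found.
# Usage: preflight www-data /opt/scripts/backup.sh /var/backups
preflight() {
    expected_user=$1
    script=$2
    out_dir=$3
    if [ "$(id -un)" != "$expected_user" ]; then
        echo "running as $(id -un), expected $expected_user" >&2
        return 1
    fi
    [ -x "$script" ] || { echo "not executable: $script" >&2; return 1; }
    [ -d "$out_dir" ] && [ -w "$out_dir" ] || { echo "not writable: $out_dir" >&2; return 1; }
}
```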

4. Email Delivery Not Configured

Cron's built-in failure notification mechanism sends an email via the local MTA (usually sendmail or postfix). On most modern servers, especially cloud VMs and containers, local mail delivery is not configured. Even when it is, the mail lands in a local mailbox that nobody checks. The MAILTO crontab variable can route to an external address, but only if the MTA is set up for relay.
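When mail is not an option, the simplest fallback is to keep the output yourself. The helper below is a sketch, with log_run as a hypothetical name: it appends the job's output and exit status to a log file and passes the job's exit status through unchanged.

```shell
# Run a command, append its output and exit status to a log file,
# and return the command's own exit status.
# Usage: log_run /var/log/backup.log /opt/scripts/backup.sh
log_run() {
    log_file=$1
    shift
    echo "$(date '+%Y-%m-%dT%H:%M:%S') start: $*" >> "$log_file"
    status=0
    "$@" >> "$log_file" 2>&1 || status=$?
    echo "$(date '+%Y-%m-%dT%H:%M:%S') exit=$status: $*" >> "$log_file"
    return "$status"
}
```

A log file only helps if someone reads it, so this complements alerting rather than replacing it.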

5. Script Exits with Code 0 Despite Errors

Many scripts do not use set -e (exit on error). A Python script might catch all exceptions and print them to stdout without actually exiting with a non-zero code. From cron's perspective, the job succeeded. The output was a stack trace, but cron does not parse output, so anything that judges the job by its exit code sees a success.

    #!/bin/bash

    # BAD: errors are swallowed
    python3 etl.py 2>&1 | tee /var/log/etl.log

    # GOOD: propagate exit code
    set -euo pipefail
    python3 etl.py 2>&1 | tee /var/log/etl.log
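The difference between the two variants comes down to which command's exit status the pipeline reports. By default it is the last command (tee, which almost always succeeds); with pipefail it is the last command that failed. The snippet below demonstrates this with false standing in for a failing script.

```shell
# Without pipefail the pipeline reports tee's status, hiding the failure.
bash -c 'false | tee /dev/null; echo "without pipefail: $?"'
# prints: without pipefail: 0

# With pipefail the pipeline reports the failing command's status.
bash -c 'set -o pipefail; false | tee /dev/null; echo "with pipefail: $?"'
# prints: with pipefail: 1
```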

6. The Server Went Down

If the server reboots and the cron daemon does not restart, or if the machine is simply offline, no cron jobs run at all. There is nothing on the machine to send an alert about this — the alerting system itself is down. This is the fundamental limitation of any monitoring that runs on the same machine as the job.

7. Crontab Was Overwritten or Deleted

Someone ran crontab -e with the wrong editor, saved an empty file, and wiped all scheduled jobs. Or a deployment script replaced the crontab without including all existing entries. The jobs simply stop running, with no error and no trace.
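One inexpensive safeguard is to keep a snapshot of the crontab under version control and compare the live crontab against it. The sketch below reads a listing on stdin so it can be fed from crontab -l; matches_snapshot and the snapshot path in the usage line are hypothetical.

```shell
# Compare a crontab listing on stdin with a saved snapshot file.
# Returns 0 when identical, 1 when they differ, 2 when the snapshot is missing.
# Usage: crontab -l | matches_snapshot /opt/scripts/crontab.snapshot
matches_snapshot() {
    snapshot=$1
    [ -f "$snapshot" ] || return 2
    cmp -s - "$snapshot"
}
```

Run the comparison from somewhere other than the crontab it protects, for example a deployment pipeline, because a wiped crontab cannot check itself.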

How to Detect Silent Failures

After you find the root cause and fix it, the real question is: how do you make sure this never happens again?

The answer is external monitoring using the dead man's switch pattern. Instead of trying to detect failures from inside the machine, you use an external service that expects to receive a heartbeat from your cron job. If the heartbeat stops, the external service alerts you.
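At its core, the check the external service performs is a comparison between the time of the last heartbeat and the longest gap you are willing to tolerate. The function below illustrates that logic only; it is not how any particular service implements it, and the name heartbeat_overdue is made up for this example.

```shell
# Succeed (meaning: raise an alert) when the last heartbeat is older
# than the allowed gap. Timestamps are Unix seconds, max_gap is in seconds.
# Usage: heartbeat_overdue "$last_seen" "$(date +%s)" 90000
heartbeat_overdue() {
    last_seen=$1
    now=$2
    max_gap=$3
    [ $((now - last_seen)) -gt "$max_gap" ]
}
```

For a daily job, a max_gap of 90000 seconds (24 hours plus one hour of grace) would tolerate a slow run without missing a skipped one.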

Prevention: Set Up a Dead Man's Switch with Crontiq

Crontiq implements the dead man's switch pattern with auto-provisioning. Here is how to add it to an existing cron job in 30 seconds:

    # Original crontab entry
    0 2 * * * /opt/scripts/backup.sh

    # Add Crontiq monitoring (auto-creates the monitor on first ping).
    # Keep the entry on one line: crontab does not support backslash line continuation.
    0 2 * * * /opt/scripts/backup.sh && curl -fsS --retry 3 -o /dev/null https://ping.crontiq.io/p/cq_live_YOUR_API_KEY/backup-prod || curl -fsS --retry 3 -o /dev/null https://ping.crontiq.io/p/cq_live_YOUR_API_KEY/backup-prod/fail

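The -f flag makes curl exit with a non-zero status on HTTP errors, -s suppresses the progress output, -S still prints an error message when the request fails, and --retry 3 handles transient network issues. The && ensures the success ping only fires if the script exits cleanly. The || fires the fail ping on any non-zero exit, which also includes the case where the script succeeded but the success ping itself could not be delivered.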
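If the crontab entry gets too long, the same logic can live in a small wrapper script. The sketch below reuses the URL layout from the entry above (the monitor URL for success, the same URL with /fail appended for failure); run_with_ping and ping_url are hypothetical names. Because it branches on the saved exit status, a failed success ping is reported locally instead of triggering a fail ping.

```shell
# Send one ping; kept as a separate function so it can be stubbed out.
ping_url() {
    curl -fsS --retry 3 -o /dev/null "$1"
}

# Run a command, ping success or failure, and return the command's status.
# Usage: run_with_ping https://ping.crontiq.io/p/cq_live_YOUR_API_KEY/backup-prod /opt/scripts/backup.sh
run_with_ping() {
    base_url=$1
    shift
    status=0
    "$@" || status=$?
    if [ "$status" -eq 0 ]; then
        ping_url "$base_url" || echo "success ping could not be delivered" >&2
    else
        ping_url "$base_url/fail" || echo "fail ping could not be delivered" >&2
    fi
    return "$status"
}
```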

If the job does not run at all — because the server is down, or the crontab was wiped — Crontiq will notice the missing ping and alert you. This is the key advantage over log-based monitoring: the absence of a signal is itself a signal.

Track Duration

You can also track how long your job takes by signaling the start:

    # The start ping is followed by ';' so that a failed ping never blocks the backup itself.
    0 2 * * * curl -fsS -o /dev/null https://ping.crontiq.io/p/cq_live_YOUR_API_KEY/backup-prod/start; /opt/scripts/backup.sh && curl -fsS -o /dev/null https://ping.crontiq.io/p/cq_live_YOUR_API_KEY/backup-prod

Crontiq calculates the duration between the /start and the completion ping. If the duration exceeds historical norms (using 2-sigma anomaly detection), you get a warning.
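To make the 2-sigma idea concrete: a run is flagged when it takes longer than the historical mean plus two standard deviations. The sketch below computes that threshold from a list of past durations. It assumes a population standard deviation and a plain mean; the exact statistic Crontiq uses is not specified here, and two_sigma_threshold is a hypothetical name.

```shell
# Read one duration in seconds per line on stdin and print
# mean + 2 * (population standard deviation), rounded to one decimal.
two_sigma_threshold() {
    awk '{ n++; sum += $1; sumsq += $1 * $1 }
         END {
             if (n == 0) exit 1
             mean = sum / n
             var = sumsq / n - mean * mean
             if (var < 0) var = 0    # guard against floating-point rounding
             printf "%.1f\n", mean + 2 * sqrt(var)
         }'
}

printf '100\n110\n90\n100\n' | two_sigma_threshold
# prints: 114.1
```

With past runs of 100, 110, 90, and 100 seconds, the mean is 100 and the standard deviation is about 7.07, so a run longer than roughly 114 seconds would stand out.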

Send an Exit Code

For scripts that use specific exit codes to indicate different failure types, pass the exit code directly:

    0 2 * * * /opt/scripts/backup.sh; curl -fsS -o /dev/null https://ping.crontiq.io/p/cq_live_YOUR_API_KEY/backup-prod/$?

The $? variable contains the exit code of the previous command. Crontiq treats exit code 0 as success and anything else as failure.

Debugging Checklist

When you discover a silently failing cron job, work through this checklist:

Check /var/log/syslog or /var/log/cron (on systemd hosts: journalctl -u cron, or -u crond on RHEL-family systems) for cron execution entries
Verify the crontab exists: crontab -l
Run the command manually as the cron user: sudo -u cronuser /opt/scripts/backup.sh
Check exit code: echo $?
Verify PATH: env -i PATH=/usr/bin:/bin /opt/scripts/backup.sh
Check disk space: df -h
Check permissions on the script and all accessed files
Add external monitoring (Crontiq) to prevent recurrence
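Several of these checks can be combined by running the job in an environment that resembles what cron provides. The helper below is a sketch: run_like_cron is a hypothetical name, and the variable set (HOME, LOGNAME, PATH, SHELL) is an approximation of cron's defaults that may differ on your distribution.

```shell
# Run a command line the way cron would: through /bin/sh, with an
# otherwise empty environment apart from a few cron-style variables.
# Usage: run_like_cron '/opt/scripts/backup.sh'; echo "exit code: $?"
run_like_cron() {
    env -i HOME="${HOME:-/}" LOGNAME="${LOGNAME:-$(id -un)}" \
        PATH=/usr/bin:/bin SHELL=/bin/sh \
        /bin/sh -c "$1"
}
```

If the job works in your terminal but fails under run_like_cron, the cause is almost certainly PATH or a missing environment variable.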
