Cron jobs power the invisible backbone of modern infrastructure. Database backups, ETL pipelines, certificate renewals, report generation, cache warming — they all run on schedules, often on machines nobody is watching. The problem is simple: when a cron job fails, nothing happens. No alert. No notification. Just silence.
This guide covers every practical approach to monitoring cron jobs, from basic log tailing to fully automated anomaly detection. By the end, you will have a working monitoring setup that takes under two minutes to configure.
The cron daemon was designed in the 1970s. Its only built-in notification mechanism is email: cron mails whatever a job prints to the crontab owner (or the MAILTO address) through the local mail system, typically sendmail. On most modern servers, local mail delivery is either disabled or nobody reads the mailbox. This means a cron job can fail every single run for weeks before anyone notices.
Common failure modes include:
- Commands such as node, python3, or pg_dump may not be found, because cron runs jobs with a minimal PATH.
- Environment variables set in .bashrc are not available in cron.

The simplest monitoring approach is redirecting output to a log file:
This captures both stdout and stderr. You can then grep the log for errors, or use logrotate to manage file size. The problem: nobody checks log files proactively. You only look when something is already broken.
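The redirection described above can be sketched as follows. The schedule, script path, and log path are hypothetical. The crontab line appears as a comment because it is crontab syntax rather than shell; the lines below it replay the same redirection with a stand-in job so the effect can be checked directly.

```shell
# Crontab entry (hypothetical schedule and paths):
#   0 2 * * * /usr/local/bin/backup.sh >> /var/log/backup.log 2>&1

# The same redirection, runnable on its own. A stand-in job writes one line
# to stdout and one line to stderr; both end up in the log file.
LOG_FILE="$(mktemp)"
fake_job() {
  echo "backup started"
  echo "pg_dump: command not found" >&2
}
fake_job >> "$LOG_FILE" 2>&1

# Order matters: '2>&1' must come after '>> file'. Written the other way
# round, stderr would still point at the original stdout, not the log.
cat "$LOG_FILE"
```

Appending with `>>` rather than `>` keeps earlier runs in the file, which is what makes grepping for past errors possible.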
A step up is wrapping your cron command in a script that checks the exit code and sends an alert:
This works but has several drawbacks. You must write and maintain a wrapper for every job. You only detect failures, not missed runs. If the server itself goes down, no alert fires because the wrapper never executes. And you have no visibility into trends — was the job slower than usual? Did the row count drop?
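A minimal sketch of such a wrapper is shown below. It assumes the alert is delivered by a function called send_alert, which is a hypothetical hook: here it only writes to stderr, and in practice its body would be replaced with mail, a webhook call, or whatever the team already uses.

```shell
#!/bin/sh
# Hypothetical alert hook (ASSUMPTION): replace the body with a real channel.
send_alert() {
  echo "ALERT: $1" >&2
}

# Run the given command and alert if it exits with a non-zero status.
# The original exit status is passed through to the caller.
run_with_alert() {
  "$@"
  status=$?
  if [ "$status" -ne 0 ]; then
    send_alert "cron job '$*' failed with exit code $status on $(uname -n)"
  fi
  return "$status"
}

# Crontab usage (hypothetical paths):
#   0 2 * * * /usr/local/bin/cron-wrap.sh /usr/local/bin/backup.sh
if [ "$#" -gt 0 ]; then
  run_with_alert "$@"
fi
```

Taking the command as arguments means one wrapper can serve several jobs, which softens the maintenance drawback but not the others: a missed run or a dead server still produces no alert.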
The industry standard for cron monitoring is the dead man's switch pattern, also called heartbeat monitoring. The idea: your cron job pings an external service after each successful run. If the service does not receive a ping within the expected interval, it alerts you.
This solves the "server is down" problem because the monitoring service is external. It also detects missed runs, not just failures.
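The check that the external service performs can be illustrated with a few lines of arithmetic. This is a sketch of the general pattern, not Crontiq's implementation; the function name and the grace period are assumptions added for illustration.

```shell
# Decide whether a monitor is overdue.
#   $1 = epoch seconds of the last ping received
#   $2 = expected interval between pings, in seconds
#   $3 = grace period in seconds (ASSUMPTION: tolerates slow runs)
#   $4 = current time in epoch seconds
is_overdue() {
  deadline=$(( $1 + $2 + $3 ))
  if [ "$4" -gt "$deadline" ]; then
    echo "overdue"
  else
    echo "ok"
  fi
}

# Daily job, 15-minute grace period, last ping received 25 hours ago.
now=$(date +%s)
is_overdue "$(( now - 25 * 3600 ))" 86400 900 "$now"   # prints "overdue"
```

Because the deadline depends only on the time of the last ping, the check fires whether the job failed, never started, or the whole server disappeared.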
Crontiq implements the dead man's switch pattern with zero configuration. Here is how to set it up:
Register at crontiq.io/register with your email. You will receive an API key that looks like cq_live_80381902d7b36613.
Append a single curl command to your existing crontab entry:
That is the entire setup. The first time the ping hits Crontiq, the monitor nightly-backup is automatically created. No dashboard configuration needed. Crontiq infers the expected schedule from the ping frequency.
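The crontab change described above might look like the sketch below. PING_URL is a placeholder: the exact URL layout for the nightly-backup monitor is not reproduced here and should be taken from your Crontiq account. The curl flags are standard options chosen so that a slow or unreachable monitoring service cannot stall the job.

```shell
# ASSUMPTION: PING_URL is a placeholder for the ping URL of the
# nightly-backup monitor; its real layout comes from your Crontiq account.
PING_URL="${PING_URL:-}"

# Send one heartbeat.
#   -f          treat HTTP errors (4xx/5xx) as a failed command
#   -sS         stay quiet, but still print errors
#   -m 10       cap each attempt at 10 seconds
#   --retry 3   retry transient failures up to three times
#   -o /dev/null  discard the response body
send_ping() {
  curl -fsS -m 10 --retry 3 -o /dev/null "$1"
}

# Crontab form (crontab syntax, hence comments; hypothetical script path).
# Variables set at the top of a crontab are exported to every job:
#   PING_URL=<your ping URL>
#   0 2 * * * /usr/local/bin/backup.sh && curl -fsS -m 10 --retry 3 -o /dev/null "$PING_URL"
```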
To distinguish between success and failure, use the /fail endpoint:
The && runs the success ping only if the script exits with code 0. The || runs the fail ping if it exits with any other code.
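The control flow of that one-liner can be tried offline with stand-ins for the two pings. The function names below are hypothetical; in a real crontab they would be curl calls to the success URL and the /fail URL.

```shell
# Stand-ins for the two pings so the logic runs without network access.
ping_success() { echo "success ping"; }
ping_fail()    { echo "fail ping"; }

# Same shape as the crontab one-liner: job && success || fail
report() {
  "$@" && ping_success || ping_fail
}

report true    # job exits 0: prints "success ping"
report false   # job exits non-zero: prints "fail ping"

# Caveat: '||' reacts to the last command that ran. If the job succeeds but
# the success ping itself fails, the fail ping is sent as well. An explicit
# if/else reports on the job's exit status alone.
report_strict() {
  if "$@"; then
    ping_success
  else
    ping_fail
  fi
}
```

For most jobs the one-liner is fine; the if/else form is worth the extra lines when a spurious failure alert during a network blip would be costly.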
Crontiq's real power is its Magic Engine. If your cron job produces data, POST it as JSON:
Crontiq automatically extracts every numeric value from the JSON, stores it as a time series, and applies anomaly detection. If rows suddenly drops from 10,000 to 50, or duration_seconds jumps from 120 to 3,600, you will get an alert.
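As an illustration, a job could assemble and send such a payload as sketched below. The field names rows and duration_seconds follow the text above; the function names are hypothetical, and the URL passed to post_metrics is assumed to be the monitor's ping URL from your Crontiq account.

```shell
# Build a flat JSON payload from two numeric metrics.
#   $1 = row count, $2 = duration in seconds
build_payload() {
  printf '{"rows": %d, "duration_seconds": %d}' "$1" "$2"
}

# POST the payload. ASSUMPTION: $1 is the monitor's ping URL.
# With -d, curl sends a POST request; the header marks the body as JSON.
post_metrics() {
  curl -fsS -m 10 -H 'Content-Type: application/json' \
       -d "$(build_payload "$2" "$3")" -o /dev/null "$1"
}

build_payload 10000 120   # prints {"rows": 10000, "duration_seconds": 120}
```

Using printf with %d rejects non-numeric values early, so a broken measurement shows up as an error in the job rather than as malformed JSON.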
Crontiq uses a rolling average over the last 10 data points for each metric, combined with a 2-sigma threshold. In plain terms: if a value deviates more than two standard deviations from recent history, it is flagged as anomalous.
This approach requires zero configuration. You do not define thresholds, expected ranges, or alert conditions. Crontiq learns what "normal" looks like from the data your jobs send. It works for row counts, file sizes, durations, error rates, or any numeric value.
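The 2-sigma rule can be reproduced in a few lines of awk to see how it behaves on real numbers. This is a sketch under stated assumptions, not Crontiq's code: it uses the population standard deviation, compares the new value against the previous ten points only, and reports nothing as anomalous until at least two points exist.

```shell
# Flag a value that lies more than two standard deviations from the mean of
# the last 10 history points.
#   $1 = new value, remaining arguments = history, oldest first
# ASSUMPTIONS: population standard deviation; the new value is not part of
# the window; fewer than 2 history points always yields "no".
is_anomalous() {
  new="$1"
  shift
  printf '%s\n' "$@" | tail -n 10 | awk -v x="$new" '
    { sum += $1; sumsq += $1 * $1; n++ }
    END {
      if (n < 2) { print "no"; exit }
      mean = sum / n
      var = sumsq / n - mean * mean
      if (var < 0) var = 0          # guard against rounding error
      sd = sqrt(var)
      d = x - mean
      if (d < 0) d = -d
      if (d > 2 * sd) print "yes"; else print "no"
    }'
}

# Row counts hovering around 10,000, then a run that returns 50 rows.
is_anomalous 50 10000 10020 9980 10010 9990 10005 9995 10015 9985 10000
```

One consequence of a window this short is that a perfectly constant history has zero deviation, so any change at all is flagged; a job whose metric never varies will alert on its first legitimate change.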
When an anomaly is detected, the monitor status changes to WARNING (shown in purple on the dashboard and badge), and an alert email is sent. Alert emails are rate-limited to one per hour per monitor to prevent spam during sustained anomalies.
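The one-per-hour limit amounts to remembering when each monitor last alerted. The sketch below shows the idea with one small state file per monitor; the storage layout and function name are assumptions for illustration, not a description of Crontiq's internals.

```shell
STATE_DIR="$(mktemp -d)"   # hypothetical store: one file per monitor

# Print "send" and record the time if the last alert for this monitor is at
# least an hour old; otherwise print "suppress".
#   $1 = monitor name, $2 = current time in epoch seconds
maybe_alert() {
  last=0
  if [ -f "$STATE_DIR/$1" ]; then
    last=$(cat "$STATE_DIR/$1")
  fi
  if [ "$(( $2 - last ))" -ge 3600 ]; then
    echo "$2" > "$STATE_DIR/$1"
    echo "send"
  else
    echo "suppress"
  fi
}
```

Keying the state by monitor name is what keeps a noisy job from silencing alerts for a different one.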
Crontiq handles nested JSON structures by flattening them with dot notation:
Each metric gets its own time series and anomaly detection. No schema definition required.
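To preview which metric names a nested payload would produce, the flattening can be approximated locally with jq. This assumes jq is installed and is only an approximation of the behavior described above; the function name and the sample payload in the usage comment are hypothetical.

```shell
# Flatten nested JSON read from stdin into dot-notation keys, keeping only
# numeric leaves. Array elements get their index as a path segment.
# Requires jq (ASSUMPTION: available on the machine running the job).
flatten_metrics() {
  jq -c '[paths(type == "number") as $p
          | {key: ($p | map(tostring) | join(".")), value: getpath($p)}]
         | from_entries'
}

# Usage:
#   echo '{"db": {"rows": 10000, "tables": 42}, "duration_seconds": 120}' \
#     | flatten_metrics
```

For the sample payload in the usage comment, the resulting keys would be db.rows, db.tables, and duration_seconds, each of which would then be tracked as its own series.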
Several services offer cron monitoring. Here is how Crontiq differs:
Here is a complete example for monitoring a PostgreSQL backup job:
With this setup, Crontiq monitors three things simultaneously: backup file size (detects truncated or corrupt dumps), duration (detects performance degradation), and table count (detects accidental schema changes). All without any configuration on the Crontiq side.
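A backup script reporting those three metrics could be sketched as follows. Everything configurable here is a placeholder: the database name, backup directory, metric names size_bytes and tables, the choice to count tables in the public schema, and CRONTIQ_URL, whose real value comes from your Crontiq account.

```shell
#!/bin/sh
# Sketch of a monitored PostgreSQL backup. ASSUMPTIONS: all defaults below
# are placeholders, and CRONTIQ_URL must be set to your monitor's ping URL.
DB_NAME="${DB_NAME:-appdb}"
BACKUP_DIR="${BACKUP_DIR:-/var/backups/postgres}"
CRONTIQ_URL="${CRONTIQ_URL:-}"

run_backup() {
  out="$BACKUP_DIR/$DB_NAME-$(date +%Y%m%d).dump"
  start=$(date +%s)

  # Custom-format dump written straight to a file, so pg_dump's own exit
  # status decides success (no pipeline to mask it).
  pg_dump --format=custom --file="$out" "$DB_NAME" || return 1

  # Metric 1: size of the dump in bytes.
  size_bytes=$(wc -c < "$out" | tr -d ' ')
  # Metric 2: number of tables in the public schema.
  tables=$(psql -At -d "$DB_NAME" -c \
    "SELECT count(*) FROM information_schema.tables WHERE table_schema = 'public'") || return 1
  # Metric 3: wall-clock duration in seconds.
  duration=$(( $(date +%s) - start ))

  payload=$(printf '{"size_bytes": %d, "duration_seconds": %d, "tables": %d}' \
    "$size_bytes" "$duration" "$tables")
  curl -fsS -m 10 -H 'Content-Type: application/json' \
       -d "$payload" -o /dev/null "$CRONTIQ_URL"
}

# Crontab usage (hypothetical path):
#   0 2 * * * /usr/local/bin/pg-backup.sh --run
if [ "${1:-}" = "--run" ]; then
  run_backup
fi
```

If pg_dump or psql fails, the function returns before any ping is sent, so the missing heartbeat itself becomes the failure signal.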