checkcron: Checking for the Unexpected
Steven G. Isaacson
Most system administrators use daemons such as cron
to help keep their systems running smoothly. Daemons run
in the background, starting up as needed to do their
work and then
going back to sleep until more work needs to be done.
Running in the background is good because for the most
part you don't
need to see what is happening. But running in the background
means that nothing obvious happens when things go wrong.
What Could Go Wrong?
On our main development system we use cron to bundle
code and then transfer it to various machines on the network. The
files are bundled, transferred, then unpacked on the target system.
One night errors were reported on the target system.
The next day
changes were made to the source code to correct the
problem, but that
night the same errors appeared. This went on for several days before
someone discovered that the new code had not been transferred to the
target system. The new code had not been transferred because cron was
not running.
This particular problem could be addressed by having
the target system
move or remove the file when it was done with it, which would cause
a "missing file" error to be generated the
next night. But
that only addresses one part of a complex system.
What Else Could Go Wrong?
Recently NIS failed on our "communication box," a machine
dedicated to handling all of our incoming mail, that
is, mail from
outside of the company. Without NIS the alias file was unavailable, and
two days' worth of mail bounced back.
What's needed is a general solution, a way to check background
processes that doesn't itself rely upon background processes.
A General Solution
First, how do you tell if a background process like cron is
still running? Type ps -fu root, pipe the results to grep,
and look for cron (on some systems you cannot specify a user
and so must look through all processes).
ps -fu root | grep cron
That's easy enough to make into a shell script, and the script
could echo a warning if grep exits with a bad exit status,
indicating that /etc/cron was not found. The script could also check
for sendmail, NIS, ypbind -- any background processes
you want to keep tabs on.
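As a minimal sketch of that idea (not one of the article's listings), the
fragment below loops over a list of daemon names and warns whenever grep
exits with a nonzero status. The names DAEMONS, PS_CMD, and check_daemons,
the example daemon list, and the wording of the warning are all assumptions
made for illustration. This is the naive version; it does nothing about the
two problems discussed next.

```shell
#!/bin/sh
# Naive daemon check that relies only on grep's exit status.
# DAEMONS and PS_CMD are illustrative names, not taken from the article.
DAEMONS=${DAEMONS:-"cron sendmail ypbind"}   # hypothetical list of daemons
PS_CMD=${PS_CMD:-"ps -fu root"}              # command that lists processes

check_daemons() {
    for d in $DAEMONS
    do
        # grep exits nonzero when no line of the listing matches
        if $PS_CMD | grep "$d" > /dev/null
        then
            :    # some line mentions the daemon; say nothing
        else
            echo "WARNING: $d does not appear to be running"
        fi
    done
}
```

Calling check_daemons at the end of the script runs the check once for
every name in the list.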
But there are two problems.
The first problem is a technical one. You need to make
sure that you
find what you're looking for ... and not something else.
Let me explain.
When you type ps and grep for "cron," a new
process, with the word "cron" on its command
line, is started.
Sometimes that process shows up in the ps output and sometimes it
doesn't, depending
upon the load on the system. So if cron was found in the
output, was it /etc/cron, "grep cron," or both?
So why not just look for /etc/cron?
Checking for /etc/cron doesn't work because as soon as you
grep for /etc/cron, /etc/cron shows up as an
argument on the grep command line.
Listing 1 illustrates this problem with two examples.
The first example
usually works; the second one never works. With the
addition of a
filter, you can make it always work.
ps -fu root | sed '/grep/d' | grep cron
The command sequence (ps | sed | grep) looks as
if it won't work because the grep-delete occurs before the
call to grep.
But it does work. It works because it is only after
the shell has
parsed the command line that the three processes are started
(essentially simultaneously). Before you can attach pipes, there
must be programs
to attach them to.
So, if the "grep cron" line appears in the ps output,
the sed command deletes it. If "grep cron" doesn't
appear, it's not deleted. Either way, you get the information
you need.
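The filter can be wrapped in a small function so that it works on any
process listing fed to it. This is a sketch under assumptions: the function
name is_running and the sample ps line are made up for illustration and do
not come from Listing 1.

```shell
#!/bin/sh
# is_running NAME: read a process listing on standard input, delete every
# line that mentions grep, then look for NAME in what is left.
# The exit status is grep's: 0 if NAME was found, nonzero otherwise.
is_running() {
    sed '/grep/d' | grep "$1" > /dev/null
}

# Typical use (an assumption about how you would call it):
#   ps -fu root | is_running cron || echo "cron is not running"

# A canned listing in which the only mention of cron is the grep itself.
sample='root  2345  2300  0 09:15 p1  0:00 grep cron'

if echo "$sample" | is_running cron
then
    echo "cron found"
else
    echo "cron not found"     # this branch runs for the sample above
fi
```

Without the sed stage the same sample would be reported as "cron found,"
which is exactly the false positive described above.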
The Real Problem
The second problem is the real problem.
How do you automatically check background processes
to see if they
are still running? That is, how can you make it so the
script is run every so often without your having to
remember to do
it? (Don't say cron!)
I got around this automated process problem by using
my .profile. I simply added a call to checkcron. Now
whenever I log in I know within a few seconds if there
is a problem.
checkcron is in Listing 2.
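For readers without the listing at hand, here is a minimal sketch of a
script along these lines. It is not the author's Listing 2. The variable
names DAEMONS and PS_CMD, the example daemon list, and the message text are
assumptions made for illustration.

```shell
#!/bin/sh
# checkcron (sketch): report daemons that are missing from the process list.
# DAEMONS, PS_CMD, and the message text are assumptions for illustration.
DAEMONS=${DAEMONS:-"cron sendmail ypbind"}   # edit this list for your site
PS_CMD=${PS_CMD:-"ps -fu root"}              # or a full listing such as ps -ef

checkcron() {
    # Take one snapshot of the process list, with any grep lines removed,
    # and reuse it for every daemon.
    listing=`$PS_CMD | sed '/grep/d'`
    for d in $DAEMONS
    do
        if echo "$listing" | grep "$d" > /dev/null
        then
            :    # daemon accounted for: stay silent
        else
            echo "checkcron: $d is not running"
        fi
    done
}
```

Ending the file with a call to checkcron turns it into a standalone
script. Starting that script with a trailing & from .profile would run the
check in the background at login, so it does not delay the login itself.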
Installation is trivial. Customize the program for your site (by
editing the line with the list of daemons), and then
add one line to your .profile.
Every time you log in, you'll see a background pid
number echoed to your screen and then whatever you normally see when
you log in.
For each daemon that cannot be found, checkcron echoes a warning
message to your screen. If all daemons are accounted
for, it does nothing.
checkcron may also be run from the command line if you have
been logged in for a while and simply want to double-check.
About the Author
Steven G. Isaacson has been writing C and Informix
since 1985. He is currently developing automated testing tools at
FourGen Software, the leading developer of accounting software and
CASE Tools for the UNIX market. He may be reached via email at
uunet!4gen!steve1 or email@example.com.