An older Option in C for find
Why a C Program?
UNIX comes with so many utilities that a lot of work
can be done by
shell scripts that use those utilities -- and shell scripts can
usually be put together faster than C programs. Sometimes, though,
when the utility that does exactly what you want simply
does not exist,
you just have to write a C program.
A client of mine is putting together a system that will
receive files from customers transmitted by uucp. These files will
come in daily
from all over the country and will contain various transaction
records that need further processing locally. At least once
a day, we want
cron to wake up on my client's machine and move the contents
of the uucppublic directory into a directory where the
processing can be done. However, new files can come
in literally at
any time, including the time that the cron job wants to move
the files away. The file currently being transmitted
by uucp must
not be moved away while it is still being uploaded,
but all prior
files will qualify.
My first thought was to use the find utility, knowing
it has a bunch of interesting options for qualifying
files and then
emitting the names of those that qualify. None of the time options
(such as -atime) works with arguments of minutes, only days.
find has a -newer option that allows a specific file's
timestamp to be the base time, so that all files newer
than that time
will qualify. What we really needed was an -older option, because
we didn't want to take the file being uploaded currently, but did
want all files prior to that one. Using ! -newer might do the
trick if I could touch a file with the appropriate timestamp.
However, find has no -older option that would work with
a specific amount of time in minutes, so I decided to write one.
I wanted this program to act as if I had given a find command
that had this unsupported syntax:
find dirname -older num_minutes
where dirname was the name of the directory containing the
files I cared about, and num_minutes was a number, not a filename.
If any file in that directory was older than the specified time,
the pathname for that file would be printed out. Since
most of the
customer files would take only a couple of minutes maximum
at 2,400 bps, and a few might take as long as 5 to 10 minutes, files
older than 15 minutes would qualify. Anything more recent
would be picked up next time around.
How older.c Works
older.c, shown in Listing 1, meets these requirements. To make
it more generally useful, the program accepts a number of minutes
and a set of directories on the command line. While this
application would work with a 15-minute timeframe and
only with a
specific directory, I was sure we would find other uses that would
have different requirements. If no time is given, the program defaults
to 15 minutes, and if no directories are named, it defaults
to the current directory.
Looking at Listing 1, start with the FEBDAYS() macro. It
implements the rule that it is leap year in every year divisible by
4 except the century years, which must be divisible
by 400. The mod
(%) operator gives the remainder after a division, so
if the result
of a mod operation is zero, the value is evenly divisible by
the divisor. C implements short-circuiting for the Boolean operators
and (&&) and or (||). This means that if the result
can be determined by the first part of the operation,
the second part
does not need to be evaluated. If the first part of
an && operation
is FALSE, the whole thing is FALSE. If the first part
of an || operation is TRUE, the whole thing is TRUE.
The opposite requires that the second part be evaluated
to be certain.
So, if the year is 2000, which is evenly divisible by
400, this routine
will set February to 29 days without checking further.
A year that
is not a century year is recognized as a leap year if
it is evenly
divisible by 4. Instead of using the % operator for division
by 4, whenever the divisor is a power of 2, the binary and
operator (&) can be used with a value one less than
the divisor (3)
to get the remainder. The quotient, if needed, can be obtained
using a shift to the right (>>) instead. The shift
and binary operators are faster than the mod and division operators.
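
The leap-year rule, the short-circuit evaluation, and the masking
trick described above can be sketched as follows. Listing 1 is not
reproduced here, so this is only an assumption about what such a
macro might look like, not the article's FEBDAYS() itself; the names
IS_LEAP and FEBDAYS_SKETCH are hypothetical.

```c
/* Hypothetical sketch of a leap-year test; not the macro from Listing 1. */

/* A year divisible by 400 is a leap year, and || stops right there.
   Otherwise the year must not be a century year and must be divisible
   by 4. For a non-negative y, (y & 3) gives the same remainder as
   (y % 4) because 4 is a power of 2. */
#define IS_LEAP(y) \
    ((y) % 400 == 0 || ((y) % 100 != 0 && ((y) & 3) == 0))

/* Number of days in February for the full calendar year y. */
#define FEBDAYS_SKETCH(y) (IS_LEAP(y) ? 29 : 28)
```

The mask only matches the % result for non-negative values, which is
a safe assumption for calendar years. Many optimizing compilers make
this substitution on their own, so the choice is largely one of style.
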
The variable progname is made global for error handling. If
the program reports an error, the program's name will
be part of that
error message. Since any of the functions might deliver
an error message,
but only main() can know the program name from the command
line, using a global variable to hold the name eliminates the need
to pass it around to all the functions as a parameter.
So, the first
thing that main() does is grab argv[0], the program's
name, and put it into that variable. No matter how many arguments
will be given on the command line, even the wrong number, argv[0]
will be present.
The next step in main() is to check the command line argument
list. The total number of arguments on the command line
is in argc,
the arguments themselves being in argv. For this program,
no arguments need be given, but there are no hyphenated options;
if a hyphen is the first character of the first argument, the program
takes it as a request for help and outputs a Usage message.
If no arguments other than the program's name are given, argc
will be 1 and the default time of 15 minutes will be
taken off the
current system time. Otherwise, the command line has
the number of
minutes to be taken off. The program will presume that
only an int
capacity will be needed. (On most UNIX systems, an int is the
same size as long, so this can be quite a number of minutes.
If a long versus short is strictly required, one of
them should be used, not int. While the int is theoretically
the most efficient type for the system [this is a religious issue,
and different compiler writers will disagree for the same system],
it is also the least portable type since the ANSI C Standard allows
it to be the same size as short or long, or somewhere
in between.) If an int is 16 bits, the maximum value
of a signed
int is 32,767 minutes, which amounts to over 22 days. If it
is 32 bits, the maximum signed value is 2,147,483,647 minutes, which
amounts to almost 4,083 years! Either way is sufficient
for this program's purposes.
If a minutes argument is given on the command line, directory names
may also be given (directory names may not be given
without a minutes
argument). If no directories are named, the current
directory is used.
If directories are named, a loop runs through them one
at a time.
If an error occurs, the program quits the loop.
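
The command-line rules above can be sketched as a small helper. This
is a minimal sketch under stated assumptions, not the code from
Listing 1: the name parse_args(), its return convention, and the
DEFAULT_MINUTES macro are all hypothetical, and atoi() is used only
to keep the example short.

```c
#include <stdlib.h>

#define DEFAULT_MINUTES 15   /* default age threshold given in the text */

/* Hypothetical helper: interpret the command line as
       older [minutes [directory ...]]
   Stores the minutes in *minutes and the index of the first directory
   argument (or argc if there is none) in *first_dir.
   Returns 0 on success, -1 if a Usage message should be printed. */
int parse_args(int argc, char *argv[], int *minutes, int *first_dir)
{
    *minutes = DEFAULT_MINUTES;
    *first_dir = argc;          /* no directories: caller falls back to "." */

    if (argc < 2)
        return 0;               /* only the program name was given */

    if (argv[1][0] == '-')
        return -1;              /* no hyphenated options: treat as help */

    *minutes = atoi(argv[1]);   /* first argument is the number of minutes */
    *first_dir = 2;             /* anything after it names a directory */
    return 0;
}
```

A caller would loop from first_dir up to argc, or use the current
directory when first_dir equals argc.
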
The reduce_time() function takes the passed number of minutes
off the current time. While the ANSI C Standard gave
us a lot of flexibility
in working with calendar and clock times, it did not give us
arithmetic functions. The closest it came to that was the difftime()
function, which takes two time_t values and subtracts them,
giving the difference in seconds as a type double.
It is extremely important to avoid the trap of assuming
that time_t
values can be adjusted with ordinary arithmetic. The ANSI C
Standard requires only that time_t be an arithmetic type capable
of representing times; it does not say how the time is encoded.
A specific
date and time can be converted into a time_t by the mktime()
function, but a specific amount of time cannot portably be added to
or subtracted
from a time_t value, since there is no guarantee that time_t
is a number of seconds. Moreover, even where compilers implement
a time_t as a number of seconds elapsed from an epoch, you
cannot assume that all will use the same epoch. difftime()
allows you to handle these differences.
While it would be easy to multiply minutes by 60 and
take the resulting
seconds off the current time represented as a time_t
to get the starting time for the timestamp comparisons,
to do so would
risk making the code nonportable. The only truly portable way
is to go through the struct tm data type and muck around with
the various parts of the calendar and clock.
Therefore, I take the current time() and plug that into the
localtime() function, which translates the time_t value
into calendar and clock information for the local timezone. Minutes
and hours can be adjusted easily, and the days will
be whatever is
left over if enough minutes were given. I take the leftover minutes (0
to 59) and subtract those from the current time's minutes. A negative
result means that the time crossed backward into the previous hour,
so I add the hour back into the minutes and subtract
one from the
hour. I do the same thing with the hours, except this
time a negative
result means a cross back into the previous day.
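
The borrowing described above can be sketched like this. It is a
minimal sketch that assumes a non-negative number of minutes; the
function name borrow_clock() and its interface are hypothetical and
are not taken from Listing 1.

```c
/* Hypothetical sketch: take `minutes` (assumed >= 0) off an hour:minute
   clock reading. Returns the number of whole days that still have to
   be taken off the calendar date. */
int borrow_clock(int *hour, int *min, int minutes)
{
    int days  = minutes / (24 * 60);    /* whole days */
    int hours = (minutes / 60) % 24;    /* leftover hours, 0 to 23 */
    int mins  = minutes % 60;           /* leftover minutes, 0 to 59 */

    *min -= mins;
    if (*min < 0) {         /* crossed back into the previous hour */
        *min += 60;         /* add the hour back into the minutes */
        *hour -= 1;
    }

    *hour -= hours;
    if (*hour < 0) {        /* crossed back into the previous day */
        *hour += 24;
        days++;
    }
    return days;
}
```

Starting from 00:05 and taking off 15 minutes leaves 23:50 with one
day still to be removed from the date.
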
These steps may seem to be a lot of trouble, but if
the current time
is just a few minutes after midnight, the subtraction
will have to
deal with a day on the calendar. The real problem is
in the lack of
standard functions for doing date arithmetic. Maybe
the ANSI committee
will do something about this the next time around. While
requirements of, say, 30-, 60-, and 90-day aging or
more are usually
met by adding 1, 2, 3, or more to the month number rather than
a strict number of days, other applications might need
to be more
precise. It would help tremendously if mktime() would accept
unusual numbers in its struct tm argument. Then, if
given more seconds, minutes, hours, days, or months
than is reasonable
-- or even a negative value -- it could convert the value to
the correct calendar amount and hand back the adjusted time_t
value with leap years, timezones, and so forth accounted for.
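
For what it is worth, the ANSI C description of mktime() does say
that the original values of the struct tm members are not restricted
to their normal ranges, so a conforming library may already provide
the normalization wished for here. Whether a given library honors
that is an assumption to verify on the target system. A minimal
sketch under that assumption follows; the name minutes_before() is
hypothetical and is not part of Listing 1.

```c
#include <time.h>

/* Sketch, assuming a library whose mktime() normalizes out-of-range
   struct tm members. Returns the calendar time `minutes` before `now`,
   or (time_t)-1 on failure. */
time_t minutes_before(time_t now, int minutes)
{
    struct tm *lt = localtime(&now);
    struct tm t;

    if (lt == NULL)
        return (time_t)-1;
    t = *lt;                /* copy: localtime() may reuse its storage */
    t.tm_min -= minutes;    /* may go negative; mktime() carries the borrow */
    t.tm_isdst = -1;        /* let the library work out whether DST applies */
    return mktime(&t);
}
```

Because the adjustment is made on the local clock reading, a result
that crosses a DST boundary differs from a strict elapsed-time
subtraction by the clock change.
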
Once the problem has been reduced to a specific number
of days by
which the calendar should be adjusted, a loop is needed
to work within
the days of each month. Leap day fluctuation is accounted
for by adjusting
February's days (day). If the number of days to be taken
from the date equals or exceeds the day of the month,
I reduce the number of days
by the day of the month; since this brings the
date back to the previous month, I reduce the month
number. If reducing
the month number requires it, I reduce the year also and adjust
February's days. Regardless, I take the number of days
in this new
month and repeat the operation until the number of days
to be taken
off becomes less than the value of the day of the month.
At this point,
I take that number of days off, and the correct date
of the adjusted
month (in the adjusted year if needed) is delivered.
By building this
directly into the struct tm, the result can be passed directly
to mktime(), which returns the resulting time_t value from
the reduce_time() function.
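
The calendar loop described above can be sketched as follows. This
is a minimal sketch, not the code from Listing 1: the names
month_days, feb_days(), and borrow_days() are hypothetical, months
run 0 to 11 as in struct tm, and the year is assumed to be the full
calendar year rather than the struct tm offset from 1900.

```c
/* Hypothetical sketch of the day-borrowing loop. */
static int month_days[12] =
    { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };

/* Days in February for the full calendar year. */
static int feb_days(int year)
{
    return (year % 400 == 0 || (year % 100 != 0 && year % 4 == 0))
           ? 29 : 28;
}

/* Take `days` (assumed >= 0) off the date *year / *mon / *mday. */
void borrow_days(int *year, int *mon, int *mday, int days)
{
    month_days[1] = feb_days(*year);

    while (days >= *mday) {
        days -= *mday;              /* use up what is left of this month */
        if (--*mon < 0) {           /* crossed back into December */
            *mon = 11;
            --*year;
            month_days[1] = feb_days(*year);
        }
        *mday = month_days[*mon];   /* last day of the previous month */
    }
    *mday -= days;                  /* remainder fits inside this month */
}
```

Taking one day off March 1, 1996 lands on February 29, and taking one
day off January 1, 2001 lands on December 31, 2000.
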
Two interesting side-effects result from this. First, since subtracting
a negative number is equivalent to adding, a negative
number of minutes
will add minutes to the current time. While not useful
for this particular
application, since it works with file timestamps, this feature
could be handy for other programs using this function. Second, since
mktime() takes the timezone and Daylight Saving Time into
consideration, the result will be plus or minus an hour depending
on whether the new time has crossed over one of the
DST boundary dates.
This program is calculating an absolute time in minutes without regard
to adjustment of clocks made at DST boundary dates,
so the hour lost
or subsequently recovered will show up in the difftime() between
the starting time and the new reduced time. Nevertheless, the result
is a correct absolute number of minutes prior to the current time.
One remaining issue about the reduce_time() function would
be to eliminate its association with the current time. Instead of
calculating the now variable from the time() function,
you could pass it into reduce_time() as a parameter named
now. With that, a specific number of minutes can be subtracted
from (or added to by using a negative number of minutes) any given time.
Finding the Target Files
The show_files() function takes a directory name and a starting
time. It comes up with every filename in the specified directory and
checks each file's timestamp against the starting time. Reading
filenames from a directory is no more complicated than reading records
from a sequential data file. The directory is opened
with the opendir()
function, the filenames are delivered with the readdir() function
in a structure, and the directory is closed with the closedir() function.
The function takes the given directory name, opens the directory,
and copies the name into the pathname variable. A
slash is concatenated to it and the null terminator
is replaced to
make it a regular string again. Notice that strlen() is used
to figure the subscript of the terminator. That information translates
into a direct placing of the / character on top of the terminator
without having to use strcat(), which would make yet another
pass through the string to find that terminator. (The length
needs to be increased to represent the adding of that slash.)
Since the length of the pathname part containing the directory
is known, every filename within that directory can be appended to
the same path, once the name is discovered. All you
have to do is
keep track of where the pathname part ends -- and that is what
dirlen is for.
The pathname variable is set to 256 characters, allowing the
full pathname to be no more than 255 characters. Since
BSD and SVR4
allow 255-character filenames, the path added to that could overflow
this buffer, so this is not a particularly safe strategy. This
method should work with most pathnames, and serves to
keep the example
simple. A more robust solution would allocate the buffer from the
heap and allow it to grow on demand. You might want
to challenge yourself
to rewrite it that way.
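
The prefix-and-dirlen technique can be sketched with two small
helpers. This is a minimal sketch, not the code from Listing 1: the
function names and the PATHSIZE macro are hypothetical, and the
length checks are an addition that simply rejects names that would
not fit the fixed 256-character buffer rather than growing it.

```c
#include <string.h>

#define PATHSIZE 256    /* buffer size mentioned in the text */

/* Hypothetical helper: copy dir into pathname, add a trailing slash,
   and return the length of that prefix (the dirlen of the text),
   or 0 if the prefix does not fit. */
size_t set_dir_prefix(char pathname[PATHSIZE], const char *dir)
{
    size_t dirlen = strlen(dir);

    if (dirlen + 2 > PATHSIZE)      /* need room for '/' and '\0' */
        return 0;
    memcpy(pathname, dir, dirlen);
    pathname[dirlen++] = '/';       /* slash goes where the terminator was */
    pathname[dirlen] = '\0';        /* make it a regular string again */
    return dirlen;
}

/* Hypothetical helper: place name right after the prefix.
   Returns 0 on success, -1 if the full pathname would not fit. */
int set_file_name(char pathname[PATHSIZE], size_t dirlen, const char *name)
{
    size_t namelen = strlen(name);

    if (dirlen + namelen + 1 > PATHSIZE)
        return -1;
    memcpy(pathname + dirlen, name, namelen + 1);   /* copies the '\0' too */
    return 0;
}
```

Each new filename overwrites the previous one at the same offset, so
the directory part is written only once per directory.
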
The readdir() loop checks the file's name to see if the first
character is a dot (.). The readdir() function delivers every
name, including the directory names . and .., and the
hidden files beginning with a dot. The program should skip these
files, so, if the name does not begin with a period,
the program appends
it to the pathname variable and passes the result to the
stat() function. This handy function reads the inode information
for that file, delivering all sorts of useful facts
about the file,
including the time of the last modification (st_mtime). This
time is a time_t type, so it can be plugged directly into difftime().
The difftime() function delivers the difference between two
time_t values in seconds, represented as a double type.
Treating time as an increasing value, regardless of
the form of that
value, difftime() subtracts the second argument from the first.
If the result is greater than zero, the file's modification time
must be older than the start_time, and so the file's pathname is printed.
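
The directory scan and timestamp comparison can be sketched as one
function. This is a minimal sketch, not the show_files() of Listing 1:
the name print_older(), its return value, and the 1024-byte buffer
are hypothetical, and snprintf() stands in for the dirlen technique
to keep the example short.

```c
#include <dirent.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical sketch: print the pathname of every non-dot entry in
   dir whose modification time is earlier than start_time.
   Returns the number of names printed, or -1 if dir cannot be opened. */
int print_older(const char *dir, time_t start_time)
{
    char pathname[1024];
    struct dirent *entry;
    struct stat sb;
    int count = 0;
    DIR *dp = opendir(dir);

    if (dp == NULL)
        return -1;

    while ((entry = readdir(dp)) != NULL) {
        int n;

        if (entry->d_name[0] == '.')
            continue;               /* skip ., .., and hidden files */

        n = snprintf(pathname, sizeof pathname, "%s/%s",
                     dir, entry->d_name);
        if (n < 0 || (size_t)n >= sizeof pathname)
            continue;               /* pathname would not fit: skip it */

        if (stat(pathname, &sb) != 0)
            continue;               /* file vanished or is unreadable */

        /* difftime(a, b) is a - b in seconds; positive means the
           file was last modified before start_time. */
        if (difftime(start_time, sb.st_mtime) > 0.0) {
            puts(pathname);
            count++;
        }
    }
    closedir(dp);
    return count;
}
```

A caller would pass the reduced time computed earlier as start_time,
once for each directory named on the command line.
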
The older program emits full pathnames as the -print
option might do in the find program. Since we have started
using it in shell scripts, we have found additional
uses for it. I
hope you'll find it equally handy.
About the Author
Larry Reznick has been programming professionally since
He is currently working on systems programming in UNIX
and DOS. He
teaches C language courses at American River College
and is the owner of Rezolution Technical Books. He can be reached
via email at: email@example.com.