Article

Automatically Cleaning Temporary Directories

Paul A. Sand

This article presents a method of cleaning temporary directories on a UNIX system, removing old files and unused directories. Although the method is simple, it illustrates a number of interesting issues in UNIX programming. Programs in C and Perl that implement the method are included.

Temporary Directories

UNIX systems traditionally have had one or more temporary directories for short-term storage of files. Common pathnames for such directories are /tmp, /usr/tmp, /var/tmp, and /usr/var/tmp; symbolic links can be used to make some or all of these names refer to the same actual directory.

Temporary directories are world-writeable; any user can create files in them. In addition, many system programs (compilers, editors, etc.) automatically create files in temporary directories. Even on systems that impose disk quotas, the temporary directories often reside on quotaless filesystems, allowing users to store files there that they couldn't otherwise accommodate.

Given such universal access, files tend to accumulate in temporary directories. Users may fail to delete them, intentionally or unintentionally, while programs that automatically generate such files may fail (for any number of reasons) to remove them before exiting.

This leads to an obvious potential problem: if a filesystem that holds a temporary directory fills up with junk files, users will be unable to create additional files there. Worse, the system programs that expect to be able to create files in a temporary directory will fail, often accompanied by mystifying error messages and occasional destructiveness.

Policy

It's important, then, that debris in temporary directories not be allowed to accumulate unchecked. In many (perhaps most) situations, this means that the system administrator will need to delete files periodically according to preannounced and clearly understood criteria: in short, you need a policy.

I won't address here the inherently non-portable issue of policy development. A somewhat more portable issue, however, is communicating the policy to the users of the system: the policy implemented should be clearly spelled out in all necessary detail, and broadcast by the methods you commonly use for local announcements and documentation.

For the purposes of this article, the criteria to be used to delete files should be objective enough to allow a program to do the drudgery of sifting through the directory and removing the files that deserve it.

Technical Issues

Age is an obvious criterion to use in deciding which files to delete from a directory: older files should be deleted in preference to younger ones. UNIX complicates this issue somewhat by not really keeping track of a file's creation date: instead, UNIX stores three dates for each file: (1) the time the file was last accessed; (2) the time the file was last modified; and (3) the time of the last change to the file's inode status. (These three numbers are, respectively, the st_atime, st_mtime, and st_ctime fields of the stat structure returned from a stat(2), lstat(2), or fstat(2) system call.)

The last-modified time might seem to be a good choice for calculating a file's "true" age. There's a problem with that, however, as illustrated by this example: suppose a user has just extracted some files from a tar archive file or tape into a temporary directory. The tar extraction typically sets the last-modified date of such files to be the same as they were when they were inserted in the archive. This would make such files appear incorrectly to be "old" even though they had just been created in the directory.

In general, user programs have complete control (via the utime(2) system call) of a file's last-accessed and last-modified times -- including setting those times into the future! For that reason, the inode status change time is the most reliable measure of a file's age.

Things are slightly complicated by the fact that users can create their own directory trees of arbitrary depth within temporary directories. For example, a user might extract the entire contents of a tar archive into a temporary directory. Or, since filenames are world-readable at the top level of a temporary directory, users desiring privacy (or trying to avoid filename conflicts) might prefer that their files be hidden inside a world-unreadable subdirectory.

In such cases, it might be reasonable to allow files below the top level of a temporary directory to live longer than those at the top level. Although this probably isn't a major issue, users have to put forth a conscious effort to create such subdirectories, and it's sensible to assume that files there are a little more important.

Finally, subdirectories within a temporary directory should probably be deleted using different criteria than their age. Specifically, deleting all the files from within a directory will make the directory itself appear young because the deletion modifies the directory. But there's not much point in keeping an empty subdirectory around within a temporary directory; it's probably safe to delete no matter how young it is.

There are any number of other possible criteria for determining whether files or directories should be deleted from a temporary directory: for example, you might want to save an old file from deletion if it has been recently accessed; you might want to exempt certain users or group members from having their files deleted; and so on. Once the basic system has been implemented, these criteria should be relatively easy to add to the cleaning process.

The specific policy implemented as an example here is as follows:

Two temporary directories, /tmp and /usr/tmp, are automatically cleaned.

Files at the top level of a temporary directory are removed when they are over one day old.

Files in subdirectories of a temporary directory are removed when they are over three days old.

A file's age is determined by its inode status change time.

Empty subdirectories will simply be deleted, unless they are owned by root.

This last exception will prevent (for example) deletion of the lost+found directory used by fsck(8), should that exist in a temporary directory.

Implementation

If the example policy were slightly simpler, a few periodically executed find(1) commands would no doubt be the best choice for implementation. For example, this removes all files in /tmp with ages over three days:

find /tmp -type f -ctime +3 -exec \
/bn/rm -f {};

For a more complex policy, though, it's probably better to bite the bullet and write a program to do the cleanup according to the exact rules. Since the program will most commonly be used on a periodic -- and probably unattended -- basis, it's important that it be robust: for example, it shouldn't be confounded by unexpectedly deep directory hierarchies or other activity in the filesystem. It should be readily modifiable, so that when the underlying policy changes, the program can be easily brought into conformance. Finally, it should be readily portable to any UNIX system.

The program presented here, cleantmp, has been written in C and Perl. Both versions of the program share the same algorithm and command syntax. cleantmp is designed to be called as follows:

cleantmp [-v] [-n]

The -v and -n options are for debugging purposes. Specifying the -v (verbose) option will cause the program to output the age and status (whether it is to be deleted or spared) of each file and subdirectory in the temporary directories it is cleaning. The -n (no execute) option will cause no deletion to be done.

Since directory trees can be created to arbitrary depths within a temporary directory, the program needs to be able to navigate up and down the tree without losing its way. Many versions of UNIX provide the ftw(3) library function, which traverses a directory hierarchy, calling a user-defined function for each object seen. Unfortunately, this won't work for cleantmp: a minor problem is that ftw(3) visits a directory before visiting the objects in the directory; for cleaning purposes, it's better to do it the other way around. The major problem with ftw(3), however, is that it calls stat(2) to return the inode data for each object. For symbolic links, stat(2) gives data on the linked-to object rather than the link itself; if the linked-to object is a directory, that directory would also be recursively traversed and (in this program) cleaned. Given that users can create symbolic links with few restrictions, it almost goes without saying that this could have effects ranging from undesirable to catastrophic.

The solution is to use lstat(2) instead of stat(2); for symbolic links, lstat(2) returns data on the link itself. Some versions of UNIX provide nftw(3), a file-tree-walker that uses lstat(2), but not enough so that use of nftw(3) can be considered widely portable.

So cleantmp needs to have its own file-traversal algorithm. A recursive function is a natural choice for such a traversal; in the C version of the program, the recursive function is declared as

unsigned clean(const char *dirname,
int level)

where dirname is the pathname of the directory to be cleaned, and level is the depth of the directory below the top of the temporary directory. (The level argument is needed because of the different age limits for files at the top level of the temporary directory versus those in subdirectories.) The function returns the number of entries in the directory after cleaning; this allows the program to delete directories that have become empty during the cleaning process.

The main program, then, simply kicks everything off with two calls to clean(), for example (in C):

clean("/tmp", 0);
clean("/usr/tmp", 0);

The algorithm for the clean function is best summarized in pseudo-code:

clean(dirname, level):
make a list of files in dirname
change working directory to dirname
for each file in list:
if "." or "..", skip
call lstat(2) on the file
if the file is a directory:
recursively clean the directory
(at level+1)
if the directory is empty after cleaning
(and not owned by root), rmdir(2) it.
else (not a directory):
calculate age
if it's too old, delete it
endif
end for
change working directory to parent directory
return number of entries in cleaned directory

A couple of details here are worth pointing out: the algorithm specifies that the directory entries be read all at once into a list instead of being read and processed one at a time; this avoids keeping each directory open during a traversal down a directory hierarchy. Otherwise, the program could easily hit the system's limit on the number of file descriptors opened by a single process.

For similar reasons, the clean() routine uses the chdir(2) system call to move the working directory of the process up and down the hierarchy. An alternate method would have been to avoid chdir(2) and keep track of the entire current pathname in a string variable, using it to specify pathnames to unlink(2) and rmdir(2) for deletion. However, since the hierarchy can be arbitrarily deep, the pathnames can become very long; this is difficult to deal with in a portable and robust manner. The chdir(2) method avoids this, but as a tradeoff encounters another (at least theoretical) disadvantage: it's possible to become lost during the traversal if another process deletes the parent directory of the one you're working in. In that case, the chdir(2) to the parent directory would fail, and the current version of the program would die with an error message. (In months of actual use, however, this hasn't occurred.)

C Program Overview

The C version of the program (Listing 1) uses only standard C and UNIX library function calls and it should be portable to any UNIX system. The main program begins by processing any command-line options (via getopt(3)), complaining with the usual usage error message if bad options or extra arguments are specified. It then records the current time in the variable now, and calls the recursive clean() function twice to do the real work.

The clean() function, as described above, first makes a list of the entries in the specified directory. The C version implements this as a simple linked list, calling make_filelist() to construct it. clean() next calls chdir(2) to move to the directory in question, then traverses the list, processing each entry in sequence.

For each entry in the list -- excepting the entries for the directory itself (.) and its parent (..) -- the program calls lstat(2) to obtain the relevant information about the entry. If the entry refers to a subdirectory, clean() is called recursively to clean that subdirectory, returning the number of entries remaining after cleaning. If the subdirectory (after cleaning) is empty, then it is deleted (unless, as stated above, it is owned by root).

If the entry is not a directory, the function calculates the file's age by using the C library function difftime(3) on the time recorded in the now variable and the inode status change time returned from the call to lstat(2); if the file is older than the specified range, the program calls unlink(2) to delete it.

It should be pointed out that the program stays sane when files get deleted by other processes during its operation; in fact, this is a relatively common occurrence in actual use. The only action is to print a warning message and continue. (The exception, as described above, is when entire directories get deleted by another process.)

Finally, after all entries in the directory have been processed, the clean() function frees up the linked list of entries and calls chdir() to return to the parent directory; this last step is necessary for the recursive calls to the function.

The remaining functions in the C program are relatively trivial, but perform tasks that are generally useful in many programs. Table 1 lists their names and purposes.

Perl Program Overview

The Perl version of the program (Listing 2) is identical in operation and method to the C program (though the C version does somewhat more exhaustive error checking). The Perl language contains many built-in features which make the program source code much shorter than the C version. For example, a single statement in Perl can make a list of names from an opened directory.

Summary

This program has been in use on a number of systems at the University of New Hampshire for well over a year. The most common use is to run it twice a day automatically by means of a cron job. While it does a good job of removing unwanted files from temporary directories, it does have some limitations: specifically, users can exempt files from deletion by using touch(1) periodically to make them appear young again.

This can be a problem for systems (like ours) with a few users who store files in the temporary directories on a long-term basis to escape their disk quota limits. It is easy to think up ways to modify cleantmp to deal with such evasive tactics automatically; unfortunately, it's even easier to think up ways to defeat those modifications. The bottom line is: it is difficult, if not impossible, for an automatic program to keep determined users from maintaining files in a temporary directory on a long-term basis.

Our solution was to insert an "escape clause" in the announced policy, to the effect that extremely large files or directories may be removed at any time if the filesystems containing the temporary directories become 90 percent full. The filesystems are monitored periodically (also automatically), and mail is sent to the system administrator if the 90 percent limit is breached. The system administrator will then judiciously perform any deletions necessary to bring the filesystem back to a safer status. This seems to be a workable method; it doesn't eliminate abuse, but keeps it to levels where other users aren't hindered from getting their work done.

About the Author

Paul Sand is a system administrator for academic UNIX systems at the University of New Hampshire. He has worked with UNIX systems since 1984. He may be contacted via email as pas@unh.edu.