Cover V02, I02
Article
Listing 1
Listing 2
Listing 3
Listing 4
Listing 5
Listing 6
Listing 7

mar93.tar


maxtab: Automatic File Pruning

Steven G. Isaacson

Disk space (or the lack thereof) is a perpetual problem. Unfortunately, little can be done by way of automation except to find and remove core files.

Some files, by their very nature, grow over time. Examples of these are system log files and mail outboxes, both of which are prime candidates for automatic file pruning. At first it might seem that truncating the file and saving a backup copy to tape would take care of the problem. This might very well work, if you have a way to automatically back up files and then clean them up. However, what you're usually interested in is the most recently entered information, a rolling history. You don't care about last week's or last month's log files. You simply want to have access to the latest entries without also having to remember to clean out the files.

I present here two automatic file pruning programs: maxsize.sh and maxtab. maxtab not only improves on the original maxsize.sh design -- it takes a new approach. I include both programs because in some cases the original may be more useful than the "improved" version.

maxsize.sh

maxsize.sh (Listing 1) is a shell script which makes it easy to limit files automatically to a predefined number of lines. It is designed to be run nightly via cron.

maxsize.sh works as follows. You specify a file limit -- for example, 1,000 lines. maxsize.sh first checks the file to see if it is less than or equal to 1,000 lines. If it is, the file remains unchanged. If it isn't, the last 1,000 lines are tailed and subsequently replace the original file. Thus only the most recent 1,000 lines are retained (assuming the file is always appended to).

I originally planned to have a configuration file that listed filenames and the maximum number of lines each file should have. The configuration file would be world-writeable, so that everyone could make use of the program. This meant that root's crontab would run maxsize.sh in order to have permission to work on files owned by various users.

But immediately two problems appeared. The first had to do with the use of a temporary file. The temp file stores the lines read from the end of the file. These lines are then used to replace the original source file. But the temp file is created by root with permissions determined by root's umask. So when the temp file replaces the original source file, the newly pruned source file gets the new temp file settings, and the original owner and permissions are lost. Suddenly you no longer own your own file nor can you edit it.

To address this problem, I added two more fields to the configuration file: permissions and ownership.

filename            lines perms owner
/usr/lib/sukill/sukill.log   3000  u+rw  root

With the owner and permissions settings stored in the configuration file, the original file could easily be restored to its proper settings after it had been pruned. This led to the second problem.

Recall that maxsize.sh is run by root. A world-writeable configuration would then allow anyone to create files of any ownership and with any permission setting -- a root-setuid program, say. Oops.

This second problem was addressed by having maxsize.sh su to the owner of the file before creating the temp file, and by checking for improper requests for permission settings -- that is, not allowing setuid bits. If an entry containing a setuid bit was made, the system administrator would be sent mail and the entry ignored.

But the malicious user problem still remains. Your log file can be reduced from 1,000 lines to 5 by a simple change to the configuration file. A bogus entry might also be made for a file you did not wish to have pruned (/etc/passwd, for example). All that is required is for someone to name the file and claim that the owner is root.

The ultimate resolution of these problems was less than ideal: namely, the configuration file is not world-writeable. Instead, a system administrator must administer it.

maxtab

maxtab (see Listing 4), the improved version of maxsize.sh, is patterned after crontab. It avoids the permission problems and does not require a system administrator. Users maintain their own maxtab files.

The maxtab file contains a list of filenames and line limits. Here is a two-line maxtab file:

/usr/stevei/mail/outbox 2000
/usr/work/logs/sccs.log 500

Root's cron runs the maxtab.sh script nightly. For each file it finds, it first sus to the owner of the file and then proceeds to prune the listed files. If the owner does not have permission to prune the listed file, the attempt fails, just as it would if the owner manually executed the command. No attempt is made to deal with ownerships or permissions. UNIX handles that. Only you can edit your maxtab file.

The temp file problem was handled differently. Recall that permissions on a new file are determined by umask setting. The problem remains so long as /bin/mv is used to overwrite the source file. The problem was solved by cating the contents of the temp file into the source file instead of moving it.

cat $tmpfile > $source
rm $tmpfile

Rather than creating a new file, this approach simply alters the contents of an existing file. The lines written out to the temp file make their way back into the source file, and the source file retains its original ownership and permissions.

Using maxtab

maxtab [file]
maxtab -r
maxtab -l

The maxtab file is stored in /usr/spool/cron/maxtabs/username and can be invoked with two options, -r and -l. The -r option removes the maxtab file from the maxtab directory; the -l option lists the contents of the user's maxtab file.

If called with no options, maxtab takes input from the standard input. If the maxtab file does not exist, it is created; if it already exists, it is replaced.

An additional script, maxedit, makes it easier to update the maxtab file. maxedit calls maxtab -l and writes the results (the current contents of your maxtab entry) to a temp file. After you make your changes with vi, maxedit asks if you would like to update your maxtab file. If you answer yes, it calls maxtab again and updates your maxtab file with the contents of the temp file. This makes working with your maxtab entry much easier.

If you link maxedit to cronedit, you can edit your crontab in the same way.

The Source Code

The files included here for the maxsize application are:

maxsize.sh (Listing 1) -- shell script run by cron

maxsize.rc (Listing 2) -- configuration file for maxsize.sh

flimit (Listing 3) -- standalone script to prune files

The files for maxtab are:

maxtab.c (Listing 4) -- C source to create /usr/bin/maxtab

maxtab.sh (Listing 5) -- shell script run by cron

mxfl (Listing 6) -- improved standalone script to prune files

maxedit (Listing 7) -- front-end script that simplifies editing maxtab and crontab files

Possible Enhancements

crontab supports "allow" and "deny" files. These files limit access to crontab. The same could easily be done with maxtab (for code to do just that, see Sys Admin, vol. 1, no. 4, "sukill: Stopping Unruly Processes").

Another nice enhancement would be to log the lines that get pruned from the file. This could be done easily by adding the name of the log file to the maxtab and providing mxfl with a new flag. The danger is that the new log file may also need to be pruned.

You might want to alter maxtab.sh or maxsize.sh so that they refuse to prune binary files.

Which to Use

maxsize.sh evolved into a sysadmin file management tool -- a useful product but not what we originally set out to build. It is bundled nicely in three files (two shell scripts and the configuration file) and enables the system administrator to easily maintain files with different ownerships. There is no need to wade through an assortment of maxtab entries. Installation is as simple as creating an entry in root's crontab to execute maxsize.sh and adding a list of files (and permissions, ownerships, etc.) to be pruned.

maxtab completed the task of building an easily maintainable and secure system for automatically pruning files, but at the expense of greater complexity. A special maxtabs directory is now required, and maxtab itself is a C program that must be compiled and altered to run with a setuid bit.

Given that the first program requires maintenance (which is what I wanted to avoid in the first place), and the second program is more likely to be used because it's more readily available, I would recommend using maxtab.

There may be cases, however, where maxsize.sh is appropriate -- for example, if you want to limit /usr outboxes to a certain size. An entry could be made for each user's outbox, with the appropriate ownership and permission settings, thus forcing all users to be good neighbors.

About the Author

Steven G. Isaacson has been writing C and Informix 4GL applications since 1985. He is currently developing automated testing tools for FourGen Software, the leading developer of accounting software and CASE Tools for the UNIX market. He may be reached via email at uunet!4gen!steve1 or steve1%4gen@uunet.uu.net.