Cover V02, I04


A Disk Usage Report Generator

Leo Willems


Disks are by definition too small: this is a law of nature. Monitoring your filesystems to keep disks from getting full is the best remedy, but this is easier said than done. Potential disk-burners include:

  • Files that have been removed but are still open (these won't be dealt with in this article).

  • Files that grow extremely quickly (these can be located with a find command).

  • Files that only grow a little each day and are not deleted by some cron activity.
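A minimal sketch of the kind of find command that can locate rapidly growing files is shown here. The function name big_recent_files and the default threshold are assumptions made for illustration; they do not come from the listings.

```sh
#!/bin/sh
# big_recent_files DIR [BLOCKS]: hypothetical helper for illustration.
# Prints regular files under DIR that were modified less than a day ago
# and are larger than BLOCKS 512-byte blocks.
big_recent_files() {
    dir=$1
    blocks=${2:-20480}   # assumed default: 20480 * 512 bytes = 10 MB
    # -xdev keeps find on one filesystem; -mtime -1 means "less than 24 hours old"
    find "$dir" -xdev -type f -mtime -1 -size +"$blocks" -print
}
```

Called as big_recent_files /var 2048, for example, it would list files under /var larger than 1 MB that changed within the last day. A file that grows slowly every day can stay under such a threshold for a long time, which is why the find approach is not sufficient on its own.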

    The slower growing files (and sometimes the rapidly growing ones as well) pose a problem: where is the guilty file?

    As a system administrator, I used to spend several hours per month trying to find out where the diskspace was being used. Of course, I studied the output of df every day, but that told me only what I already knew: free diskspace was shrinking every day. When usage exceeded the 90 percent limit, or when the output of df showed free space dropping by 10 percent or more, I had to search for the directory (and then find the person) that had abused my disks.

    This article discusses a tool that can give you a quick overview of how diskspace is being used and which directories are changing.

    A Disk Usage Report Generator

    The shell tool that I created does the following:

  • Every night, runs a du on every mounted filesystem.

  • Compares this night's data with yesterday's data with awk.

  • Reports differences.

  • Mails the report to me.

  • Occasionally, compares today's data with last month's data to select the slow-growing directories.
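The nightly du step in the list above can be sketched as follows. This is not the du_daily script from Listing 2; the function name du_snapshot and the choice of the -k and -x options are assumptions made for this example.

```sh
#!/bin/sh
# du_snapshot FS OUTFILE: hypothetical helper for illustration.
# Records the disk usage of every directory below FS in OUTFILE,
# one "size<TAB>directory" line per directory.
du_snapshot() {
    fs=$1
    out=$2
    # -k reports sizes in kilobytes; -x keeps du on one filesystem.
    # Errors for unreadable directories are discarded in this sketch.
    du -kx "$fs" > "$out" 2>/dev/null
}
```

Writing each night's output to its own file keeps yesterday's data available for the comparison step.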

    I use four scripts: du_daily (Listing 2), a script executed by cron every night to run du on the filesystems under examination; fss_daily (Listing 3), a script that starts the comparison and mails the result; the comparison script itself (Listing 1), which does the actual work of comparing two du outputs; and du_clean (Listing 4), a cleanup script that prevents my own scripts from being blamed for using diskspace.

    The first lines in the du_daily script are dependent on your installation. All of these lines are marked with the text DEP. Configuration should not be difficult.

    How the Scripts Work

    The comparison script (Listing 1), which compares the output of two du sessions, is run with the following arguments: -'+>' -o olddu -n newdu

    where "=" represents equal, "-" represents removed, ">" represents larger, "<" represents smaller, and "+" represents added.
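To illustrate how two du outputs can be compared with awk and labeled with these symbols, here is a minimal sketch. It is not the script from Listing 1: the function name du_compare, its argument order, and the output layout are assumptions, and the input is assumed to be "size<TAB>directory" lines as du writes them.

```sh
#!/bin/sh
# du_compare OLD NEW [MARKS]: hypothetical helper, not the article's Listing 1.
# OLD and NEW are files of "size<TAB>directory" lines as written by du.
# MARKS selects which kinds of change to print (default: all of them).
du_compare() {
    awk -F '\t' -v sel="${3:-=+-<>}" '
        # Lines from the first file: remember the old size of each directory.
        FILENAME == ARGV[1] { old[$2] = $1 + 0; next }
        # Lines from the second file: classify against the old size.
        {
            seen[$2] = 1
            size = $1 + 0
            if (!($2 in old))        { mark = "+"; was = 0 }
            else if (size > old[$2]) { mark = ">"; was = old[$2] }
            else if (size < old[$2]) { mark = "<"; was = old[$2] }
            else                     { mark = "="; was = old[$2] }
            if (index(sel, mark)) printf "%s %8d %8d %s\n", mark, was, size, $2
        }
        # Directories that appear only in the old file have been removed.
        END {
            if (index(sel, "-"))
                for (d in old)
                    if (!(d in seen)) printf "%s %8d %8d %s\n", "-", old[d], 0, d
        }
    ' "$1" "$2"
}
```

With this sketch, du_compare olddu newdu '+>' would print only the directories that are new or have grown, each line showing the symbol, the old size, the new size, and the directory name.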

    The du_daily script runs every night. As part of its work, it calls fss_daily and, when fss_daily has finished, du_clean. du_clean keeps every report made on the first day of a month, along with the most recently produced reports. In du_daily you can configure how many days of reports are kept.
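The retention rule described above can be sketched with find. This is not the du_clean script from Listing 4: the function is named du_clean only to match the text, and the du.YYYYMMDD file naming scheme and the default of seven days are assumptions made for this example.

```sh
#!/bin/sh
# du_clean DIR [DAYS]: hypothetical sketch of the retention rule.
# Assumes datafiles are named du.YYYYMMDD (an assumed naming scheme).
# Removes datafiles older than DAYS days, except those whose name ends
# in 01, that is, those made on the first day of a month.
du_clean() {
    dir=$1
    keep=${2:-7}    # assumed default: keep one week of recent reports
    find "$dir" -type f -name 'du.????????' ! -name 'du.??????01' \
        -mtime +"$keep" -exec rm -f {} +
}
```

Because the first-of-month files are excluded by name rather than by age, they survive no matter how old they become, which is what makes the monthly comparison possible.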

    Calling du_daily

    The cron command line to run du_daily is as follows:

    30 2 * * * (cd /home3/leo/fss; du_daily)

    The scripts are meant to be executed in the directory where the datafiles and scripts are placed (i.e., /home3/leo/fss). On the first night the comparison will fail because there is no earlier data to compare against, but the datafiles will still be created, so the run on the second night will succeed. Figure 1 shows sample output.

    About the Author

    Leo Willems is a UNIX systems programmer and consultant. For the last six years he has conducted and developed UNIX-related courses for AT Computing. At the present time, he is working for TUNIX Open System Consultants. He can be reached at