Cover V02, I03
Article
Listing 1

may93.tar


quota: A Gentle Enforcer

Sherwood Botsford

I take care of 20-odd UNIX boxes in a research lab. Many of our users run simulations that churn out vast quantities of data. Most of the time it's looked at once and forgotten. Some users forget to delete it, or get emotionally attached to their results and are reluctant to archive it to tape.

As a consequence, the normal state of a disk is full, and keeping more than 10 percent free space on a disk requires constant reminders and pleas to users to compress, discard, or move their data to tape. The obvious solution was some form of disk quota.

Of the four architectures at our site, two do not implement quotas at all (Next and Stardent) and the other two (Sun and AIX) don't implement them the same way. In any case, that didn't matter, as our servers were the machines that didn't have quotas. I wanted a system that would put steadily increasing pressure on users to do something about their files, but still give them some flexibility (I really didn't want to get calls on the weekend demanding more disk space). Thus came into being quota.pl, a Perl script to enforce disk quotas in a rather relaxed way -- and do so on a network basis, instead of file system by file system.

How quota Works

When quota is run, it sums the disk usage by home directory for each user and compares that to a table of quotas that (unlike BSD quotas) I can keep anywhere. If the user is over 90 percent of the quota, a warning message is mailed out. If the user is over quota, a somewhat sterner message is sent. Finally, if the user is over the limit for seven out of the last ten days, the account is disabled.

This still allows our users to go hog wild -- at least for a few days -- but it also puts pressure on them to clean things up. The nagging is automated (a relief for me, as I hated to do that), but they have to come to me to get more space (though most users will try to do something themselves first). When they do come, I can explain about compress, zip, and our UserBack tape script. If they have been good citizens this way, I will probably grant them another 10 meg or so.

The quotas file itself is a flatfile sorted by increasing quota. (I keep mine unreadable by anyone but root, not for security reasons, but to keep users from comparing quota allotments.) Users who don't have a quota are at the top, as a reminder to deal with those people who dropped through the cracks. New users can be added anywhere. The file is resorted when updated.

The format of the file is simple: user logon ID, quota in kilobytes, and a 10-character string that stores the user's over-quota history, with "O" for over and "W" for warning. The first character of the string is the most recent run of quota:

smith     25000 OOOOOOOOOO
jones     80000 -WOW------
wilson    80000 ----------
martin    80000 ----------
doe       100000 ----------
johnson   100000 OOWWW-----

The log file isn't much more complex. It has five fields: a four-letter indicator of the action, the user ID, quota history, and current disk limit.

Quota run at Thu Feb 25 17:44:41 MST 1993
OVER: johnson 	OOWWW----- 	U=106127 	Q=100000
DSLB: smith  		OOOOOOOOOO 	U=30129  	Q=25000
NOTQ: mair

Commentary on the Script

Customize lines 3 and 4 as you see fit. I try to keep anything that changes frequently on a partition that gets backed up nightly, hence the fairly long pathname. Some administrators may prefer to keep it all in /etc. $AdminDir is the directory where the quotas file and the log file are kept. Users start to get mail when they reach $WarnAt of their quota.

One of Perl's strong features is to treat a subprocess as a filehandle. I wanted both to keep track of what quota was doing on a daily basis and to have a running record of what it had done. So LOGFILE is a pipe that both appends to quota.log and sends me a single mail message summarizing its actions for the day (lines 11-15). [Note: either lines 11 and 12 must be joined, or the newline following "quota.log" must be escaped.]

Lines 17 through 22 read in the current quotas and store the quota and usage history as a pair of associative arrays keyed on user (@history and @quota).

The next block (26-39) gets a current list of all users from yellow pages then checks for disabled accounts (denoted by a * at the beginning of the encrypted password) and system accounts (group number less than 100). quota ignores these accounts.

At the same time, I pick up the user's home directory. All this information could be kept in the quotas file, but then I'd have to keep them synchronized, which is more than I want to do.

With the setup complete, processing of each user begins (line 44). First, the routine grabs the stored value for the user's quota. If the user is missing from the quotas file, then $quota{$u} is undefined. An undefined string becomes zero in a numeric context.

Line 50 gets the user's current disk usage by running du on his/her home directory. Putting $diskusage in parentheses forces an array context. Otherwise, split would return 2, for the number of fields it found. The first item returned by split goes into $diskusage, and the rest is dropped into the bit bucket.

Now comes a series of checks. There are four possible situations: the user may not have a quota; may be over quota; may be close to quota; or should be ignored. If the user doesn't have a quota, the routine prints a message to the log file and proceeds.

If the user has a quota, the check moves from worst to best cases. Each of these blocks concatenates a character onto the front of the history string and chops a character off the back.

The over-quota block (starting at line 59) treats the most serious case. The course of action to be taken depends on how many times the user has been over quota during the last ten days. (I allow for first time, multiple times, and final warning; if you are more imaginative, you can easily add more elsifs with appropriate subroutines for each day.)

In a scalar context, split returns the number of items in the list. However, I found that if the string ended with a delimiter (an "O"), it wasn't counted, so I add an arbitrary character ("X") before splitting. The -1 in line 57 is a result of n delimiters separating n+1 items -- and I need the number of delimiters.

The warning and okay blocks are simpler versions of the over-quota block.

quota closes by writing out a sorted list with the new histories. Sorting by quota puts the zeros at the top, where they immediately claim my attention when I edit the list. The formatted print statement forces a zero for quota, but the history string is left blank. Removing the +1n in the sort would give an alphabetical list.

At present each run of quota overwrites the existing file. If the process were interrupted at this point, the quotas would be lost. I could write to a temporary file, then move or copy it to the permanent file, but I thought that would be more trouble than it's worth, since our backup script runs two hours before the quota script does.

Subroutines

Most of the routines send a mail message to the user. This could have been embedded in the process user loop, but that would have made the structure of the loop hard to follow. This way, on a creative day, I can duplicate FirstOver, change its name to SecondOver, write another message, and add an elsif($NumOver == 2) {&SecondOver;} to the main loop.

The final action, disabling the user's account, is just a call to sed to replace the first colon with a colon and asterisk. I call make to update the NIS password map right after. This would be inefficient if I had thousands of users, but in a small lab, the odds are against having many people being disabled on the same day. If you have a large user list, write out all the sed pattern matches to a file, and then run sed once at the end.

About the Author

Sherwood Botsford manages a batch of UNIX boxes for the University of Alberta node of the Canadian Network for Space Research. You may contact him via email at sherwood@space.alberta.ca or c/o Physics Dept., University of Alberta, Edmonton, Alberta, Canada T6G 2J1.