Cover V08, I11


Controlling Spam

David Wartell

As any systems administrator at an organization with a dedicated Internet connection already knows, unsolicited commercial email, or “spam”, is a hot topic. Users of your network, your co-workers, your boss, and even the government have something to say about unsolicited email, and none of it is nice. Tools and techniques for defending against spam and keeping it out of your organization's network have been the subject of many articles and books. While this problem definitely deserves the attention it is getting, there is another closely related problem lurking in your network.

This problem is how to stop users on your local network from sending unsolicited bulk email into the wilds of the Internet. Preventing users on your network from sending spam can be tricky. These individuals have already been given some level of trust if they have been allowed to connect to and use the resources of your organization's network. Your mail server has to be open to relaying from these users' computers if they are to send any email at all. And tricks using reverse DNS lookups to test the validity of these users' hostnames won't do any good, at least if you have your DNS configuration in good working order. To cope with the problem, most organizations just post the industry standard acceptable use policy, maybe even make some users agree to it, and if the agreement is broken, remove the user's account to prevent further abuse.

This approach works, but there is a high price to pay. Valuable network bandwidth and CPU time may have already been consumed by tens of thousands of spam emails by the time the complaints start rolling in. Furthermore, your organization's reputation will have dropped a notch in the eyes of all of those who complained.

Sendmail Integration

After I received over five hundred complaint emails about a user on our network sending spam, I wrote a script to limit spam. The script is written in Perl and integrates with Sendmail by reading its log file once a minute and adding senders (that is, “from” addresses) that meet certain criteria to a file of senders who are blocked from relaying. To stop the senders from relaying, the script uses the check_mail ruleset in Sendmail. The check_mail ruleset is part of the anti-spam macro in Sendmail version 8.9, and must be installed in order for this script to work. I won't go into any detail on how to install and set up this ruleset because there have been several well-written articles on the subject. Michael Schwager's article, “De-Spamming the Enterprise with Sendmail 8.9” (Sys Admin, February 1999), provides an excellent step-by-step rundown of setting up the ruleset.

The check_mail ruleset works by refusing to relay messages with “sent from” addresses matching those of known spammers. The addresses of known spammers, or senders you refuse to relay mail for, are kept in a database file in hash format called “access”. The Perl script integrates with the check_mail ruleset by reading the Sendmail log file once a minute and looking for lines containing a relay entry. The total number of messages sent from each sender since the mail log file was last rotated is counted. If that total exceeds a predefined limit, the sender is then blocked from relaying any more messages by adding their address to the access database that the check_mail ruleset uses. The script was designed to limit the amount of email a user can send in one day, so it is important to rotate the mail log file daily. If you are not already doing so, then add the following line to your crontab file with the appropriate path to the Sendmail log file on your system:

10 3 * * * root rotate -z -r 7 /var/log/maillog
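
The counting step described above can be sketched roughly as follows. This is an illustration in Python, not the Perl of the original script, and it rests on assumptions: that the sender of each message appears in the log as from=<address> on a line tagged with a queue ID, and that one such line is counted as one message. The names count_senders and senders_over_limit are hypothetical.

```python
import re
from collections import Counter

# Assumed log layout:
#   "... sendmail[pid]: QUEUEID: from=<addr>, size=..., nrcpts=N, relay=..."
FROM_RE = re.compile(r"sendmail\[\d+\]: (\w+): from=<?([^>,\s]*)>?,")

def count_senders(log_lines):
    """Count logged messages per sender address (compared in lowercase)."""
    counts = Counter()
    for line in log_lines:
        match = FROM_RE.search(line)
        # Skip lines without a sender, and the empty sender used by bounces
        if match and match.group(2):
            counts[match.group(2).lower()] += 1
    return counts

def senders_over_limit(counts, limit):
    """Return, in sorted order, the senders whose count exceeds the limit."""
    return sorted(sender for sender, n in counts.items() if n > limit)
```

In a setup like the one in this article, the addresses returned by senders_over_limit would be the candidates for the access database; how they get written there is left out of this sketch.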

Script Setup

Because limiting the amount of email that users can send is more of an art than a science, there are several configuration files you must set up in order for the script to work smoothly in your network.

1. Set $local_network to the path of a file defining the IP networks your organization uses. This should include any networks you relay mail for. The script uses this to distinguish between messages that originated locally on your network, and messages from the Internet. Chances are, these networks are already defined in another Sendmail configuration file. (Or they should be, if you have your mail server closed off to relaying mail for the entire Internet!)

2. The value of $mailcw should be the path to a file that has a list of domains for which Sendmail is accepting mail. A file with this information probably already exists as part of Sendmail's configuration. If you define the domains for which you accept mail directly inside your main Sendmail configuration rather than in a separate file, then you'll have to create this file.

3. The third file is a list of “from” addresses that are trusted and never blocked. The location of this file is specified in $address_dontblock. Any “from” address that is used when sending large volumes of legitimate mail on your network must be in this file. It is a good place for the “sent from” addresses used by any list servers on your network. These addresses are legitimate and will certainly generate lots of locally originated email, which you do not want to block.

4. The fourth file contains a list of usernames that are trusted. This file is defined in $user_dontblock. Usernames that will send large volumes of email that could be at any of the domains for which you accept mail should be in this file. This is a good place for system addresses like postmaster and MAILER_DAEMON. These addresses may appear without a domain portion to the “from” address if sent from the localhost, or they will appear at one of your local domains. You don't want these messages blocked from relaying.
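
To make the role of these four files concrete, here is a minimal Python sketch of the two decisions they feed: whether a connecting IP address is local, and whether a “from” address is trusted. The file formats are assumptions (one CIDR network per line for the network file, lowercase entries for the others), and load_networks, is_local, and is_trusted are hypothetical names, not functions from the original Perl script.

```python
import ipaddress

def load_networks(lines):
    """Parse one network per line (CIDR notation assumed); skip blanks and comments."""
    networks = []
    for line in lines:
        entry = line.strip()
        if entry and not entry.startswith("#"):
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks

def is_local(ip, networks):
    """True if the connecting IP address falls inside any listed network."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)

def is_trusted(sender, trusted_addresses, trusted_users, local_domains):
    """True if the sender should never be blocked.

    All three collections are assumed to hold lowercase strings.
    """
    sender = sender.lower()
    if sender in trusted_addresses:
        return True
    user, _, domain = sender.partition("@")
    # A trusted username counts with no domain at all (mail from the
    # localhost) or at one of the domains we accept mail for.
    return user in trusted_users and (domain == "" or domain in local_domains)
```

For example, with postmaster in the trusted usernames and example.com in the local domains, both postmaster and postmaster@example.com are trusted, while postmaster at any other domain is not.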

Next, add an entry to your crontab file so the script is run every minute.

* * * * * root nice -20

Make sure the entry in your crontab file is all on one line. I call the script “spamstopper”, and place it in a bin directory inside /usr/local where I also keep all of the configuration files specific to this script. I also use nice since this script can consume considerable resources if you relay large volumes of email. For more information on nice, see your system's online manual pages.

Finally, you must define some variables.

1. Define your local domain name in $localdomain.

2. Specify the email address to which warnings should be sent in $sysadmin. Every time a spammer is blocked, you will be emailed with the “from” address and the IP address from which the message originated, as well as the volume of messages. You can then use this information to enforce your acceptable use policy at your leisure.

3. Also, set $sendmail_prog to the location of your Sendmail binary, so this script can send the warning emails.

4. Set $spammer to the location of your access database, usually called access.db. Do not include the .db extension in the file name because that is assumed with a DBM file.

5. The $maillog variable should contain the path to your Sendmail log file, usually called maillog.

6. In $database_file, define the path to a file where the program can store a DBM style database of the log file lines it has parsed. By doing this, the program only parses the lines that have been appended to the mail log file since the last time it ran, rather than reading the entire file every time. The script will create this file when it is first executed. If the mail log file is rotated, the script will detect it, delete this database, and start from the beginning of the fresh mail log file.

7. In $database_cache, define the path to a file where the program can store a second DBM database of recipients it has yet to match with senders. Sometimes a recipient address gets logged before the corresponding sender address. When this happens, the program does not know who the sender is until it is logged by Sendmail at a later time. This database will make sure every email is accounted for.

8. Give the script a path to an empty text file where it can log blocked spammers, so you have a record of who was blocked. This is defined in the $blocked_file variable.
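
The idea behind $database_file, parsing only what has been appended since the last run and starting over after a rotation, can be illustrated with a simpler technique than the DBM database the script uses. The Python sketch below remembers a byte offset and the file's inode instead. It is an assumption-laden illustration, not the author's implementation: read_new_lines and the state dictionary are hypothetical, and the caller would need to save that dictionary between runs.

```python
import os

def read_new_lines(log_path, state):
    """Return lines appended since the last call; start over if the log rotated.

    state is a dict holding the byte offset and inode seen on the previous run.
    """
    info = os.stat(log_path)
    # A different inode, or a file shorter than our offset, means rotation
    rotated = (state.get("inode") != info.st_ino
               or info.st_size < state.get("offset", 0))
    offset = 0 if rotated else state.get("offset", 0)
    with open(log_path, "rb") as log:
        log.seek(offset)
        data = log.read()
    # Keep whole lines only; a half-written last line waits for the next run
    end = data.rfind(b"\n") + 1
    state["inode"] = info.st_ino
    state["offset"] = offset + end
    return data[:end].decode("utf-8", errors="replace").splitlines()
```

Tracking an offset keeps each once-a-minute run cheap even when the log is large, which is the same goal the article describes for the DBM approach.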

Setting the Threshold

Set the limits on the volume of messages your users can send in one day. This is the part that takes a little experimentation. There are several limits.

1. The $local_limit variable defines the number of messages a sender on your network (meaning they connected with an IP address defined in $local_network) can send in one day, or at least since your mail log file was last rotated. I use 300 on my network so that no users on my network (unless they are trusted and are in either the files $user_dontblock or $address_dontblock) can send more than 300 emails in a day. The script works not by stopping all spam but by limiting it before it gets bad. Because most spammers will send tens of thousands of messages in several hours, limiting them to sending 300 significantly decreases the damage.

2. A side effect of analyzing the mail log file in this fashion is that you not only catch spammers sending spam from your network, but you can also catch spammers from the outside targeting the users on your mail server. You know this has been done when the load average on your mail server climbs to dangerous levels, and when your mail log file shows one sender trying to send a message to tens of thousands of usernames@yourdomain, many of which don't exist. The $nonlocal_limit is the maximum amount of email a sender not on your local network can send. I suggest setting this value higher than the others, because legitimate list servers can generate a high volume of email with the same “from” address to users on your network. I use a value of 500 here.
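
Taken together, the two limits amount to a small decision rule. The Python sketch below uses the values quoted above (300 and 500); the function name should_block is hypothetical, and exempting trusted senders from both limits is an assumption of this sketch rather than something the article spells out for non-local senders.

```python
LOCAL_LIMIT = 300      # value the article uses for senders on the local network
NONLOCAL_LIMIT = 500   # value the article uses for senders from outside

def should_block(message_count, sender_is_local, sender_is_trusted=False):
    """Decide whether a sender has exceeded the limit that applies to them."""
    # Assumption: trusted senders are exempt regardless of where they connect from
    if sender_is_trusted:
        return False
    limit = LOCAL_LIMIT if sender_is_local else NONLOCAL_LIMIT
    # Block only once the count goes past the limit, not when it reaches it
    return message_count > limit
```

Under this rule a local user is cut off at the 301st message of the day, while an outside list server sending 450 messages to your users is left alone.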

Test It

It is imperative that you test the script and the above values before setting it up in a production environment. The thresholds on the volumes of different kinds of email can be extremely dependent on your environment and the volume of mail relayed on your network. To test, make some copies of several rotated or archived mail log files from your mail server. Then, make a copy of your access database file and point the script to these test files by defining the appropriate values. Run it and see which senders are being blocked by looking at the script's log file specified in $blocked_file. You can safely run this script off to the side as long as you are not working with the live access database. If possible, use a log file from a time when you know you were attacked by a spammer and see if the limits you used will block them. It is very important to thoroughly test these thresholds or you could significantly disrupt the flow of email on your network.
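
When experimenting with archived logs, it can help to see how several candidate thresholds would behave at once. The short Python sketch below is one hypothetical way to do that: given per-sender message counts already extracted from an old log (the dictionary in the usage example is made-up data, not real measurements), it reports which senders each candidate limit would block.

```python
def blocked_by_threshold(counts, candidate_limits):
    """Map each candidate limit to the sorted list of senders it would block."""
    return {
        limit: sorted(sender for sender, n in counts.items() if n > limit)
        for limit in candidate_limits
    }

if __name__ == "__main__":
    # Hypothetical per-sender counts from one day's archived mail log
    sample_counts = {
        "newsletter@example.com": 420,
        "bulk@example.net": 25000,
        "alice@example.com": 35,
    }
    for limit, senders in blocked_by_threshold(sample_counts, [100, 300, 500]).items():
        print(limit, senders)
```

If a legitimate sender shows up in the list for the limit you planned to use, either raise the limit or add that sender to one of the trusted files before going live.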

Once this script is installed, and the thresholds on email volume are properly set, you should be able to reduce the complaint emails about spam originating from your network. With that said, I hope I'll see less spam in my inbox as a result of systems administrators putting this script to work.

About the Author

David Wartell is a network and systems analyst specializing in UNIX and the Internet. Currently he is busy working for his own company, ActionWeb Services, which specializes in Web page design and software engineering for Web sites. He can be reached at: