De-Spamming the Enterprise with Sendmail 8.9

Michael Schwager

Unsolicited bulk email, or "spam" as it is not-so-lovingly referred to, is a growing problem in almost every organization. Spammers, the originators of such communications, daily spew forth thousands of messages advertising everything from the latest get-rich-quick scheme to pornography sites. Not only are such missives potentially offensive, they also consume significant system and people resources, if allowed past your email gateway. Thus, filtering spam has become an important element in systems administration. Running a UNIX system as your email gateway, however, provides you with one of the best spam-fighting tools around - Sendmail. In the January article, "Sendmail as Gatekeeper", I discussed the basics of what Sendmail does and how it accomplishes its tasks. It's time to look at our raison d'etre: Sendmail's anti-spam rulesets. This discussion covers Sendmail 8.9 and above.

Spam-Specific Rulesets

Sendmail, as you may recall, filters and redirects incoming email based on various rulesets and delivery methods. These rulesets provide a flexible and highly configurable method for processing incoming communications, and include the following techniques for de-spamming your machines:

Ruleset	Purpose
check_relay	Deny mail from clients found in Paul Vixie's Realtime Blackhole List.
check_mail	Deny mail from known spamming email addresses in the envelope "From:", such as friend@public.com.
check_rcpt	Deny mail based on the envelope "To: ". This ruleset is checked before Ruleset 0, and thus is preferred. Also, rules are included in this set to prevent relaying. If a connection comes from a host that is not in your domain, and the recipient is not destined for your domain, deny the mail.
Header Rules	You can write rules that match the contents of any mail message header and reject the message. In addition to simple patterns, you can also match regular expressions.

Generate the Config File

Central to Sendmail's functionality is its so-called config file, sendmail.cf. To begin our quest to squelch spam, let's generate a config file, using Sendmail's m4 method. As long as your version of m4 is fairly modern, it should work. Check the README file in the cf subdirectory of the Sendmail install hierarchy for details about whose versions of m4 are likely to fail, and where to get the latest GNU m4 (which will surely work).

cd into the Sendmail hierarchy. When you extracted the Sendmail tar file, it should have created a subdirectory that looks like sendmail-8.x.y, where x and y represent the current release numbers. cd there.
cd to the cf subdirectory.
make a directory ../../Sendmail.inst for Sendmail install.
execute: tar cf - . | (cd ../../Sendmail.inst ; tar xf -)
cd ../../Sendmail.inst
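
The steps above can be collected into one small shell function. This is only a sketch: the name copy_cf_tree and its two arguments are invented for illustration, and it assumes tar is on your PATH.

```shell
# copy_cf_tree SRC DEST
# Copy the contents of the cf directory SRC into DEST, creating DEST if
# needed, using the same tar pipeline as the steps above.
# (Hypothetical helper; not part of the Sendmail distribution.)
copy_cf_tree() {
    src=$1
    dest=$2
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    mkdir -p "$dest" || return 1
    (cd "$src" && tar cf - .) | (cd "$dest" && tar xf -)
}
```

From the directory that holds the extracted source, you might call it as copy_cf_tree sendmail-8.x.y/cf Sendmail.inst, substituting your own release numbers, and then cd Sendmail.inst.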

Let's assume we're going to install Sendmail for a master mail server for the domain "antispam.com". All the hosts inside antispam.com send mail to this server, and from there through the firewall to the Internet. Antispam.com receives mail for all domains inside the antispam.com domain (i.e., for anything.antispam.com). I recommend installing Sendmail this way:

mkdir Antispam within the current (Sendmail.inst) directory
cd Antispam
Create a file called "build" that contains:

  ###BEGIN###
  #!/bin/sh
  # Rebuild antispam.cf from antispam.mc, keeping the previous copy.
  
  [ -f antispam.cf ] && cp antispam.cf antispam.cf.old
  m4 ../m4/cf.m4 antispam.mc > antispam.cf || exit 1
  cp antispam.cf /usr/local/sendmail-r8.9/lib/sendmail.cf
  ###END###

Later, if you are creating new configs for other hosts on this machine, comment out the last line. You don't want to overwrite a working config.

chmod 755 build

Prepare m4:

cp ../domain/generic.m4 ../domain/antispam.m4
ln -s ../domain/antispam.m4 antispam.m4
ls ../ostype, and choose the file that corresponds to your operating system type.
Edit antispam-head.mc, and include the following lines (Solaris 2 is shown):

  ### BEGIN ###
  #
  #  This is taken from the  generic configuration file for SunOS 5.x
  #  (a.k.a. Solaris 2.x)
  #
  # The `DOMAIN' file is in ../domain... unfortunately, we can't group it
  # here with the other m4 files...
  divert(0)dnl
  VERSIONID(`@(#)antispam-solaris2.mc  8.8 (Berkeley) 5/19/98')
  OSTYPE(solaris2)dnl
  DOMAIN(antispam)dnl
  MAILER(local)dnl
  MAILER(smtp)dnl
  define(`confMAX_MESSAGE_SIZE', `5000000')dnl
  define(`confCHECKPOINT_INTERVAL', `5')dnl
  define(ANTISPAM_INSTALLDIR, `/usr/local/sendmail-r8.9')dnl
  define(`confCW_FILE', `ANTISPAM_INSTALLDIR/lib/sendmail.cw')dnl
  define(`confCR_FILE', `ANTISPAM_INSTALLDIR/databases/relay-domains')dnl
  define(`ALIAS_FILE', `ANTISPAM_INSTALLDIR/lib/aliases')dnl
  define(`HELP_FILE', `ANTISPAM_INSTALLDIR/lib/sendmail.hf')dnl
  define(`STATUS_FILE', `ANTISPAM_INSTALLDIR/lib/sendmail.st')dnl
  define(`confPRIVACY_FLAGS', `authwarnings,noexpn,novrfy,noreceipts,restrictqrun')dnl
  define(`confSMTP_LOGIN_MSG', `[$j Your Greeting Here $b]')dnl
  define(`confRECEIVED_HEADER', `[$?sfrom $s $.$?_($?s$|from $.$_) $.by $j (confCF_VERSION)$?r with $r$. id $i$?u for $u; $|; $.$b]')
  FEATURE(access_db, hash -o ANTISPAM_INSTALLDIR/databases/access.db)dnl
  FEATURE(blacklist_recipients)dnl
  divert(-1)
  The above changes the default for:
  FR-o /etc/mail/relay-domains
  O AliasFile=/etc/mail/aliases
  O HelpFile=/etc/mail/sendmail.hf
  O StatusFile=/etc/mail/sendmail.st
  divert(0)dnl
  #
  define(`MAILER_TABLE', `hash ANTISPAM_INSTALLDIR/databases/mailertable.db')dnl
  ### END ###

We will be able to reuse this header file not only on Internet mail relays, but also on internal machines. The information inside the file should be constant across our configurations.

Create your file antispam.mc. This is the m4 file upon which we'll build (see the antispam.mc file available at the Sys Admin Web site: www.samag.com or ftp.mfi.com in /pub/sysadmin).

m4 Background

Remember that our command is:

m4 ../m4/cf.m4 antispam.mc > antispam.cf

The purpose of m4 is to replace text with other text (for example, the define lines listed above). Here, m4 first reads the ../m4/cf.m4 file, which figures out where we are in the filesystem and sets the _CF_DIR_ macro so that other m4 files can find the files they need to include. cf.m4 pulls in the cfhead.m4 file, which defines many options and sets up the basic m4 diversions. cfhead.m4 in turn calls proto.m4, which contains most of the rulesets. m4's output is the text that remains from the input files after all the replacements have been made.

A special note about diversions: I found m4 diversions confusing, but they are really like placeholders. They say, "This is slot X. When I say 'undivert X', output whatever text you have been storing in that slot." Diversions 1-9 are possible. Diversion 0 is basically a default, since it brings you back to normal output. Diverting to anything other than 0-9 discards the text. This is why you'll see a lot of divert(-1) in the m4 macro files. Using an illegal diversion buffer is a way of including comments.
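
To see both ideas at once (text replacement and a discarding diversion), here is a small demonstration wrapped in a shell function. It is a sketch, not anything from the Sendmail distribution: the function name show_diversions and the macro GREETING are invented, and it assumes an m4 binary is on your PATH.

```shell
# show_diversions: feed m4 a tiny input that uses divert(-1) as a comment
# block and define() for text replacement.
# (Hypothetical demo; GREETING is an invented macro name.)
show_diversions() {
    m4 <<'EOF'
divert(-1)
Everything here is thrown away, so it works as a comment.
define(`GREETING', `hello')
divert(0)dnl
GREETING from slot zero
EOF
}
```

Running show_diversions prints the single line "hello from slot zero". The comment text vanishes, yet the define inside the discarded region still takes effect, which is exactly how the Sendmail m4 files use divert(-1).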

You can find more information about m4 by doing a man m4, but for any serious m4 work, you may find the man pages sparse. A good introduction to m4 is hard to find, but Sun's Answerbook contains some useful information in the Programming Utilities section. The Info files for m4 from the GNU distribution are very informative as well.

Sendmail's Databases

We're almost ready to try out our initial Sendmail config file. Before that, though, Sendmail needs some database files to do its work. The files needed with our current configuration and their type are:

Pathname	Type
/usr/local/sendmail-r8.9/lib/sendmail.cw	text
/usr/local/sendmail-r8.9/databases/relay-domains	text
/usr/local/sendmail-r8.9/databases/mailertable.db	hash
/usr/local/sendmail-r8.9/databases/access.db	hash
/usr/local/sendmail-r8.9/lib/aliases	hash

Files that you can create for Sendmail, but which will not be discussed here, include:

/usr/local/sendmail-r8.9/lib/sendmail.hf	text
/usr/local/sendmail-r8.9/lib/sendmail.st	text

File Descriptions

sendmail.cw - The sendmail.cw file is useful if you receive mail for a domain name other than your own. For example, say our domain is antispam.com. We receive mail for ouch.com. In this case, we would want the sendmail.cw file to contain simply:

ouch.com

Other domains can be added, one line at a time. We won't work with this file here, but since Sendmail includes it by default in the domain/generic.m4 file, create a 0-length file for now:

touch /usr/local/sendmail-r8.9/lib/sendmail.cw

relay-domains - This file is somewhat related to the sendmail.cw file. In this file, you would include the domain names of hosts that are allowed to relay mail through you. For example, ouch.com sends mail to the Internet through antispam.com. Normally, we would not want this to happen, because we don't want people outside our domain to be able to relay through us to the Internet. However, since we are hosting ouch.com, we will make an exception. This file would contain that information. For right now, let's make that an empty file as well:

touch /usr/local/sendmail-r8.9/databases/relay-domains

mailertable.db - This file is used to "override routing for particular domains," in the words of Eric Allman. This means, for example, that mail destined for ouch.com arrives at our antispam.com mail relay machine. If we have an entry like the following in the mailertable:

ouch.com      esmtp:internal.antispam.com

then whenever mail arrives destined for the ouch.com domain, our server will in turn connect to internal.antispam.com for further delivery. Let's create a mailertable file containing just that information. Use your favorite text editor to create the file and include the aforementioned text. Spaces or tabs separate the two fields. After editing, you need to build the database; I recommend creating a script that will renew all the databases that we'll be working with. Here is the script:

### BEGIN
#!/bin/sh

PATH=$PATH:/usr/local/sendmail-r8.9/bin ; export PATH

cd /usr/local/sendmail-r8.9/databases
makemap hash mailertable.db < mailertable
makemap hash access.db < access
newaliases
### END

I'll call it mkdbs. Don't forget to make it executable.
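
Before rebuilding, it can help to catch malformed mailertable lines. The following is a minimal sketch of such a check; check_mailertable is a made-up name, and the sketch assumes that blank lines and lines starting with # should be skipped and that every other line is a domain followed by mailer:host.

```shell
# check_mailertable FILE
# Print each line of FILE that does not consist of exactly two
# whitespace-separated fields, the second of the form mailer:host.
# Returns non-zero if any such line is found.
# (Hypothetical helper; skipping blank and # lines is an assumption.)
check_mailertable() {
    awk '
        NF == 0 || /^#/ { next }
        NF != 2 || $2 !~ /^[^:]+:.+$/ {
            printf "line %d: %s\n", NR, $0
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

You could run check_mailertable mailertable just ahead of the makemap line in mkdbs and stop if it reports anything.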

access.db - The access database is the core of the anti-spam capabilities of Sendmail 8.9 and above (as of this writing). The access database will contain all sorts of entries dealing with allowing or denying mail from and to hosts, domains, or usernames. The access database looks like this:

###BEGIN


     cyberspammer.com        550 We don't accept mail from spammers
     okay.cyberspammer.com   OK
     sendmail.org            OK
     192.168.0               RELAY
     myfriend@               REJECT
     friend@public.com       REJECT
     slammer@                550 Envelope Mail address rejected. See http://www.antispam.com
###END

Use your favorite editor to create the access file.

Like the mailertable, each entry in the access database has a key and a value. This time the key is a host name, a domain name, a network address, or a username. The following table describes the values; I am quoting verbatim from the README file in Sendmail's cf directory:

Value	Meaning
OK	Accept mail even if other rules in the running ruleset would reject it.
RELAY	Allow domain to relay through your SMTP server. RELAY also serves an implicit OK for the other checks.
REJECT	Reject the sender/recipient with a general purpose message.
DISCARD	Discard the message completely using the $#discard mailer.
### any text	Where ### is an RFC 821 compliant error code, and "any text" is a message to return for the command.

You can see that RELAY duplicates the work of the relay-domains file, with one exception. The access database is used in the check_mail ruleset and the relay-domains file is not. Normally this will not be of any concern.
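
Since a mistyped value is easy to miss, here is a rough sketch of a checker that compares each value in the access file against the list above. The name check_access is invented, and the sketch assumes one entry per line, with blank lines and lines starting with # ignored.

```shell
# check_access FILE
# Report entries whose value is not OK, RELAY, REJECT, DISCARD, or a
# three-digit code followed by text. Returns non-zero if any are found.
# (Hypothetical helper; one entry per line is an assumption.)
check_access() {
    awk '
        NF == 0 || /^#/ { next }
        {
            value = $0
            sub(/^[ \t]*[^ \t]+[ \t]*/, "", value)   # strip the key
            sub(/[ \t]+$/, "", value)                # strip trailing blanks
            if (value ~ /^(OK|RELAY|REJECT|DISCARD)$/) next
            if (value ~ /^[0-9][0-9][0-9] .+/) next
            printf "line %d: bad value for %s: %s\n", NR, $1, value
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

Run it as check_access access before makemap. It only checks the shape of the values; it does not know whether a key is sensible.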

aliases - This is the usual Sendmail aliases file. Create a skeleton aliases file or copy your existing one to that file. For our purposes, we won't discuss it. Here is a simple aliases file:

###BEGIN
##
#  Aliases can have any mix of upper and lower case on the left-hand side,
#     but the right-hand side should be proper case (usually lower)
#
#     >>>>>>>>>>      The program "newaliases" will need to be run after
#     >> NOTE >>      this file is updated for any changes to
#     >>>>>>>>>>      show through to sendmail.
##

# Following alias is required by the mail protocol, RFC 822
# Set it to the address of a HUMAN who deals with this system's mail
# problems.
Postmaster: yournamehere

# Alias for mailer daemon; returned messages from our MAILER-DAEMON
# should be routed to our local Postmaster.
MAILER-DAEMON: postmaster

# Aliases to handle mail to programs or files, eg news or vacation
# decode: "|/usr/bin/uudecode"
nobody: /dev/null

#######################
# Local aliases below #
#######################

###END

Now that we have all the files we need, run the build script that you created earlier, then run the mkdbs script. Fix any errors that may come up. One possible error is that Sendmail may complain:

/etc/sendmail.cf: line 102: fileclass: cannot open \
  /usr/local/sendmail-r8.9/lib/sendmail.cw: Group writable directory

This is Sendmail helping you make your host more secure. The fix is to chmod g-w every directory along that path. For example:

chmod g-w /usr
chmod g-w /usr/local
chmod g-w /usr/local/sendmail-r8.9
chmod g-w /usr/local/sendmail-r8.9/lib

If any of these directories is a symbolic link, makemap will complain, because a symlink itself always shows permissions for everybody. You will have to go back and edit the antispam-head.mc file and change ANTISPAM_INSTALLDIR so that the directory path contains no symlinks.
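
To find the offending directories in one pass, something like the following sketch can walk up from a directory toward the root. The name check_dir_chain is invented, and the function only approximates the two conditions described above; Sendmail's own checks may be stricter.

```shell
# check_dir_chain DIR
# Walk from DIR up toward /, reporting every component that is a symlink
# or is group-writable. Returns non-zero if anything was reported.
# (Hypothetical helper; it only approximates Sendmail's safety checks.)
check_dir_chain() {
    dir=$1
    status=0
    while [ -n "$dir" ] && [ "$dir" != "/" ] && [ "$dir" != "." ]; do
        if [ -L "$dir" ]; then
            echo "symlink: $dir"
            status=1
        elif [ -n "$(find "$dir" -prune -perm -020 -print 2>/dev/null)" ]; then
            echo "group-writable: $dir"
            status=1
        fi
        dir=$(dirname "$dir")
    done
    return $status
}
```

For example, check_dir_chain /usr/local/sendmail-r8.9/lib lists each directory that still needs chmod g-w or that has to be replaced by its real path.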

Now we have installed Sendmail, and we have rudimentary anti-spam measures. As of this moment, we have:

The check_mail ruleset:

Disallows mail from parties who use hostnames in the envelope address (i.e., during "MAIL From:" in the SMTP conversation) that are not found in DNS. Remember, all "From:" addresses can be forged.
Disallows mail from parties who do not use hostnames in the envelope address.
Disallows mail from senders/domainnames/hostnames that you may have entered into the access database.

The check_rcpt ruleset:

Disallows relaying through our host. Mail that comes from a machine not located in our domain had better be going to an address within our domain, or we will assume we're being hijacked. We can specifically allow hosts to relay through us by using the access database.
Disallows mail to recipients that we may have entered into the access database.

The check_relay ruleset:

Is run just as soon as the client connects to our mail host.
Looks up connecting host's hostname and/or IP address in the access database to allow or disallow connections.
Is used by the RBL (see below).
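
The anti-relay idea behind check_rcpt can be sketched in a few lines of shell. This is a simplification for illustration only, not the ruleset itself: the function names are invented, the list of domains we accept or relay for is passed in by hand, and the real ruleset also consults the access database.

```shell
# host_in_domain HOST DOMAIN: true if HOST is DOMAIN or a subdomain of it.
host_in_domain() {
    case $1 in
        "$2" | *."$2") return 0 ;;
    esac
    return 1
}

# relay_decision CLIENT_HOST RCPT_DOMAIN DOMAIN...
# Print "accept" if the recipient or the connecting client falls inside
# one of the listed domains, otherwise print "deny".
# (Hypothetical model of the anti-relay idea, not Sendmail's rules.)
relay_decision() {
    client=$1
    rcpt=$2
    shift 2
    for domain in "$@"; do
        if host_in_domain "$rcpt" "$domain" || host_in_domain "$client" "$domain"; then
            echo accept
            return 0
        fi
    done
    echo deny
    return 1
}
```

With antispam.com as the only listed domain, relay_decision mail.spam.example somewhereindns.com antispam.com prints deny, while the same call with recipient domain antispam.com prints accept.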

Testing, Testing

Make sure all database files and the aliases file are created as described earlier. Run the mkdbs script; as long as it exits without error, we should be ready to run. There are two ways to test Sendmail. The first is to use it in "address test mode". This can be useful if you don't have a machine to play with. You can safely run Sendmail off to the side without affecting current mail processing. However, it's not a real-world test. A more rigorous test would be to run Sendmail on your machine.

Address Test Mode

Address test mode goes like this. Type:

/usr/lib/sendmail -bt

If you want to see the operation of the rulesets in detail, type:

/usr/lib/sendmail -bt -d21.4 -d21.12

Use the correct pathname if /usr/lib is incorrect. At this point, Sendmail responds with:

WARNING: Ruleset Local_check_mail has multiple definitions
WARNING: Ruleset Local_check_rcpt has multiple definitions
ADDRESS TEST MODE (ruleset 3 NOT automatically invoked)
Enter <ruleset> <address>
>

Ignore those warnings; they will not affect the operation or anti-spam capabilities of Sendmail in any way.

Address test mode is similar to a debugging mode. The idea is that you supply a ruleset and an address, and Sendmail runs the address through the given ruleset. It shows you what it is doing as it operates on an address. For example, try:

check_mail <schwager@cyberspammer.com>

Sendmail will reject this spammer; the last line should say:

rewrite: ruleset 184 returns: $# error $@ 5 . 7 . 1 $: 550 We don't \
  accept mail from spammers.

Try some other addresses in the access database and observe address test mode in action. To exit address test mode, type a control-D at the prompt.

Address test mode can be tricky to use. For example, the client_name macro can affect operation of some of the rulesets, but in address test mode it's undefined.
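
If you want to repeat a test without retyping it, address test mode can also be driven from a script. The sketch below assumes that sendmail -bt reads its "<ruleset> <address>" lines from standard input; the function name run_ruleset is invented, and the path to the binary is passed as an argument so that you can point it at the copy you are testing.

```shell
# run_ruleset SENDMAIL RULESET ADDRESS
# Feed one "<ruleset> <address>" line to SENDMAIL in address test mode
# and print whatever it answers.
# (Hypothetical helper; assumes -bt reads commands from standard input.)
run_ruleset() {
    printf '%s %s\n' "$2" "$3" | "$1" -bt
}
```

For example: run_ruleset /usr/lib/sendmail check_mail '<schwager@cyberspammer.com>'. The caveat about undefined macros applies here just as it does at the interactive prompt.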

To test Sendmail more realistically, we can run it and talk to it through port 25. If the machine you're testing on is not currently being used for mail processing, run the newly compiled Sendmail:

/usr/lib/sendmail -bD

The -bD keeps it in the foreground and makes it easy to kill in your terminal window with a control-C.

Find another machine that you can use for running telnet. Now telnet into port 25 of the machine running this Sendmail:

telnet hostname.antispam.com 25

You should see a greeting message, like:

220 [hostname.antispam.com ESMTP Antispam Mailer Tue, 05 Nov 1998 \
  14:29:01 -0500 (CDT)]

Now say hello with helo:

helo localhost

It will reply with a message beginning with "250". Let's test our anti-relay rules. Send it the following two lines:

mail from: <schwager@antispam.com>
rcpt to: <noone@somewhereindns.com>

The mail addresses should contain valid domain names. You will see:

550 <noone@somewhereindns.com>... Relaying denied

Type quit to exit. Note that our local domain "antispam.com" was given, yet we were still denied. This is because right now the check_rcpt ruleset denies mail from any host connecting to our machine if it's destined for outside our domain. Try connecting to port 25 of our anti-spam test machine from itself, and executing the same SMTP conversation. It will not fail.
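
The same conversation can be generated by a small shell function, which is handy when you want to rerun the test after each config change. This is a sketch: smtp_dialogue is an invented name, and it simply prints the commands without waiting for the server's replies, which is good enough for a quick probe but is not a proper SMTP client.

```shell
# smtp_dialogue SENDER RECIPIENT
# Print the SMTP commands from the test above, one per line with CRLF
# line endings, ready to be piped to a client connected to port 25.
# (Hypothetical helper; it does not wait for or check server replies.)
smtp_dialogue() {
    printf 'helo localhost\r\n'
    printf 'mail from: <%s>\r\n' "$1"
    printf 'rcpt to: <%s>\r\n' "$2"
    printf 'quit\r\n'
}
```

If a netcat-style client is installed (nc is an assumption here, it is not part of Sendmail), you could run smtp_dialogue schwager@antispam.com noone@somewhereindns.com | nc hostname.antispam.com 25 and read the reply codes in the output.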

To fix this, we put our domain name in the relay-domains file. I look at the relay-domains file as a short, static list of our local domain and other domains we may relay for. The access database can be used for the same purpose, but that file will likely get cluttered over time. So, put the local domain in the relay-domains file; for example:

antispam.com

Kill and restart Sendmail, then telnet in from another host and rerun the previous test. It should now work; after the "rcpt to:", Sendmail will say:

250 <noone@somewhereindns.com>... Recipient ok

Try some other things; for example you might try:

mail from: <friend@public.com>

Sendmail will say,

550 <friend@public.com>... Access denied

If you put this in the "rcpt to:" section, Sendmail says:

550 <friend@public.com>... Mailbox disabled \
                           for this recipient

The check_rcpt ruleset checked the access database and found the entry. Because we do not want to receive from friend@public.com, we also cannot send to friend@public.com; the access database does not discriminate.

The RBL

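The check_mail and check_rcpt rulesets catch only spammers who give themselves away, either by using an address that does not resolve in DNS or by using one we have already listed; everyone else still gets through. Often the problem lies in the fact that we have to see spam before we can block it. However, it is possible to be more proactive.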

Paul Vixie, a long-time Internet guru, began a service that allows you to reject spam before it reaches your site. Here's how it works:

In the Sendmail config file, rules in the check_relay ruleset check the identity of the connecting client. Sendmail looks up the client's IP address on Vixie's DNS servers. If it gets an affirmative response, Sendmail will reject the connection. The service is known as the Mail Abuse Protection System Realtime Blackhole List (or MAPS RBL, or just RBL).

The RBL is turned on by a line near the beginning of your antispam.mc file: FEATURE(rbl). I highly recommend you leave it in.
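
A DNS-based blackhole lookup works on a name built from the client's IP address: the four octets are reversed and the list's zone is appended. The sketch below builds that name; rbl_query_name is an invented helper, and the zone is passed in as a parameter because the exact zone name is not given here.

```shell
# rbl_query_name IP ZONE
# Print the DNS name a blackhole-list lookup would use for the IPv4
# address IP: the four octets in reverse order, followed by ZONE.
# (Hypothetical helper; ZONE must be supplied by the caller.)
rbl_query_name() {
    ip=$1
    zone=$2
    case $ip in
        *[!0-9.]* | *.*.*.*.*) echo "not an IPv4 address: $ip" >&2; return 1 ;;
        *.*.*.*) ;;
        *) echo "not an IPv4 address: $ip" >&2; return 1 ;;
    esac
    a=${ip%%.*}; rest=${ip#*.}
    b=${rest%%.*}; rest=${rest#*.}
    c=${rest%%.*}; d=${rest#*.}
    echo "$d.$c.$b.$a.$zone"
}
```

For instance, rbl_query_name 192.168.0.1 rbl.example prints 1.0.168.192.rbl.example (rbl.example is a placeholder zone). Looking that name up with your usual DNS tool and getting an answer corresponds to the "affirmative response" described above.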

Summary

Congratulations! You now have a potent weapon in the war against spam. Next month, we'll conclude this Sendmail series with a discussion of more proactive approaches to dealing with the problem of spam. Be sure to check the Sys Admin Web site (www.samag.com) this month for more details about how spammers spam, a detailed look at Sendmail's anti-spamming rulesets, the config files mentioned in the article, and a bibliography of anti-spam resources.

About the Author

Mike Schwager is a contractor specializing in UNIX and the Internet. He has spent the past 15 years writing C and Perl code, shell scripts, and maintaining systems in the corporate and educational environments. Email him at Michael@Schwager.com or visit http://come.to/lanicservices.