Cover V10, I02

Overcoming Large Outlook Distribution Lists: An Exercise in Sendmail Aliases

Davin Petersen

I recently found a significant flaw in Outlook 2000, related to "Distribution Lists". Our company keeps in touch with other equipment vendors by blasting sell or buy emails to multiple vendors at one time. Some of our salespeople use lists to notify customers of special product offerings or upcoming feature releases. The size of our Distribution Lists ranges from 20 email addresses to several hundred addresses.

The Problem

Outlook repeatedly times out when trying to send mail to the members of these lists. During the SMTP conversation, Outlook stops sending data; it is apparently waiting for something. Sendmail, on the other hand, is also waiting for more data from Outlook. The only common thread seems to be that the mail hangs around the 12th to 15th email address. In trying to diagnose the problem I eliminated the user's PC, the Outlook distribution list, and the size of the message. Since none of these things seemed to make any difference I assumed the problem must be in the conversation between Sendmail and Outlook.

We use a Red Hat 6.1 Intel box with Sendmail 8.9.3 for mail relay. Because Sendmail is built for routing mail, and Outlook was not having a problem sending "regular" email, I thought it was best to have my Linux box do the list processing for Outlook. When Outlook did send email to a large list, it took a few minutes to complete the transaction. An obvious advantage of having Linux/Sendmail do the list is that, from Outlook's perspective, you're only sending the message to one recipient.

The Solution

The source of the problem eluded me. Was it a TCP timeout issue in Windows 98, or in Linux? Poor Sendmail configuration? A "feature" of Outlook? Because I didn't have a lot of time to search for patches to upgrade Sendmail, Outlook, Linux, or Windows 98, I needed another solution. What I didn't need was a large-scale list server package (majordomo, listserv, etc.) to administer in addition to the rest of my daily chores. Simplicity of the system was key, as well as ease of maintenance.

The first part of solving this problem was getting Distribution Lists (DLs) out of Outlook and onto the mail server. Without this, I'd have to find something commercial. This was not a big deal since Outlook can export a DL by saving it to your disk. DLs can be saved in several formats -- .txt being the most attractive. The only hangup was that the contents of an Outlook DL in text format look something like this:

 Frank Zappa (E-mail) frank@zappa.net
 Computer Store computer@store.net
 someotheremail@someotherplace.net (E-mail) someotheremail@someotherplace.net
The "(E-mail)" was on some lines, not on others, and the alignment of characters was not consistent. It appears that, in the saving process, the text was aligned on tabs but was converted to spaces before being written. There was also a "header" on the file. This was not an acceptable form; it would have to be changed.

The next step is to get the DL to the server. My solution needed to consider first the user's current skills and second the tools available to me (Sendmail, Perl, sed, awk, grep, etc.). The simplest solution was to email the DL, manually, in the body of an Outlook message. This relies on something the user already knows and leaves the rest of the work to the back-end tools.

The process for a user to send an update over to the server is as follows:

1. Save the distribution list as a text file

2. "Insert as text" into the body of an email message

3. Send the message to the update list alias

This system gets me three things:

1. Delegation -- The list is transmitted to the mail server without adding any overhead for me

2. Decentralized Management -- A method to update the list whenever the "list maintainer" wants to do so

3. Centralized Data -- Multiple people can use the same lists

People who work with dealer/customer email have their own lists, so I didn't have to worry about multiple people updating the same list. It's a small advantage to be able to access another person's DL. Previously, you would have to forward the DL to another Outlook user and then worry about how old the recipient's copy of the list was.

A Bit About Mail Aliases

At the core of most list-processing packages is an email alias that points to a file that contains all the email addresses for that list. In Sendmail, the /etc/aliases file holds the aliases. The basic form of an alias is:

 alias: other email address
 e.g. admin: davin@foobar.net
There are two special forms of an alias: one for lists, and one that passes the message through an external program. This solution needs both forms.

 list-alias: ":include:/path/to/some/list-file"
 program-alias: "|/path/to/some/program"
After modifying /etc/aliases, run the newaliases command to rebuild the alias database so that Sendmail picks up the changes.

The format of a list-file is one email address per line. Because Sendmail reads an :include: file when it delivers a message, newaliases does not need to be run after a list has been updated. When Sendmail passes an email message to the program-alias, it sends the whole message, mail headers included, to the program on STDIN.
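
To make the STDIN behavior concrete, here is a minimal sketch of what a program-alias target receives. It is written in Python rather than the Perl of Listing 1, and the function name split_message and the sample message are illustrative assumptions, not part of the original setup.

```python
from email import message_from_string

# Hypothetical sample of what Sendmail would pipe to the program on STDIN:
# mail headers, a blank line, then the body the user pasted in.
SAMPLE = (
    "From: davin@foobar.net\n"
    "To: update_sundealers@foobar.net\n"
    "Subject: list update\n"
    "\n"
    "Frank Zappa (E-mail) frank@zappa.net\n"
)

def split_message(raw):
    """Return (headers, body) for a raw, single-part message."""
    msg = message_from_string(raw)
    headers = dict(msg.items())   # header name -> value
    body = msg.get_payload()      # plain string for a non-multipart message
    return headers, body

headers, body = split_message(SAMPLE)
print(headers["From"])
print(body, end="")
```

In a real program-alias target the raw text would come from sys.stdin.read() instead of the SAMPLE constant.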

listmaker.pl

The first hurdle to overcome is re-formatting the text DL into a list file. I chose Perl to do the task, but I could have chosen sed and maybe even grep. The script, called listmaker.pl, is fairly straightforward (see Listing 1). It takes a file on STDIN and sends the output to STDOUT. Basically, the script chops the end of line off the input string, skips blank lines, skips lines without email addresses, and skips email header lines. If it passes those tests, it grabs the email address and prints it to STDOUT. The last line is for debugging and testing.
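
For readers who do not have Listing 1 at hand, the checks just described can be sketched as follows. This is a Python restatement under stated assumptions, not the Perl script itself: the function name extract_addresses and the pattern ADDRESS_AT_END are hypothetical, and the pattern simply assumes the address is the last whitespace-delimited token on the line.

```python
import re

# Assumption: the address is the last whitespace-delimited token on the line.
ADDRESS_AT_END = re.compile(r"(\S+@\S+)\s*$")

def extract_addresses(lines):
    """Yield one address per usable line, following the checks in the text."""
    for line in lines:
        line = line.rstrip("\r\n")     # chop the end of line
        if not line.strip():           # skip blank lines
            continue
        if "@" not in line:            # skip lines without an email address
            continue
        if ":" in line:                # skip mail header lines (colon test)
            continue
        match = ADDRESS_AT_END.search(line)
        if match:
            yield match.group(1)

sample = [
    "From: davin@foobar.net\n",
    "\n",
    " Frank Zappa (E-mail) frank@zappa.net\n",
    " Computer Store computer@store.net\n",
    " someotheremail@someotherplace.net (E-mail) someotheremail@someotherplace.net\n",
]
for address in extract_addresses(sample):
    print(address)
```

Used as a filter, the same function could be fed sys.stdin and its results written to sys.stdout, which matches how the script is described as being invoked.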

The regular expression that collects the email address uses .+ rather than .*, so at least one character must precede the address. Note that both quantifiers are greedy in Perl; the non-greedy forms are .+? and .*?. The \W is a Perlism for "any non-word character", meaning anything other than a letter, digit, or underscore; whitespace qualifies, but the whitespace-only class is \s. The parentheses are regex metacharacters that create a capture group around the characters that make up the email address. After starting with some more complicated expressions, and not getting the desired result, I consulted a friend to help figure this out. One of his Perl books happened to have a regex for this very thing. The potential problem with the regex is that the email address must be at the end of the line.
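
The regex pieces mentioned here are easy to check by experiment. The sketch below uses Python's re module, which treats .+, .+?, \W, and \s the same way Perl does in these cases. The patterns are illustrative assumptions chosen to isolate each piece; they are not the expression used in Listing 1.

```python
import re

line = "Frank Zappa (E-mail) frank@zappa.net"

# Greedy .+ takes as much as it can and leaves the group the bare minimum.
greedy = re.search(r".+(\S+@\S+)$", line).group(1)     # 'k@zappa.net'

# Non-greedy .+? takes as little as it can, so the group gets the whole token.
lazy = re.search(r".+?(\S+@\S+)$", line).group(1)      # 'frank@zappa.net'

# \W is "non-word character", so a dot inside the address also satisfies it.
dotted = "Frank Zappa frank.z@zappa.net"
with_W = re.search(r".+\W(\S+@\S+)$", dotted).group(1)  # 'z@zappa.net'
with_s = re.search(r".+\s(\S+@\S+)$", dotted).group(1)  # 'frank.z@zappa.net'

print(greedy, lazy, with_W, with_s)
```

With a separator class in front of the group, a greedy .+ is harmless as long as the separator cannot occur inside an address, which is why \s is the safer choice of the two.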

The command line usage of the script is:

 listmaker.pl <infile >outfile
If you prefer pipes:

 cat infile | listmaker.pl > outfile
Since the script itself does not contain the filename to send the output to, it can be used for any number of lists. I chose to store the lists in the directory /etc/mail/lists. Since Sendmail under Red Hat runs as user/group mail/mail, the /etc/mail/lists directory must be mode 755, owned by the mail user and mail group. Sendmail-spawned processes will create all the lists stored there.

To install this program and the list into the /etc/aliases file, I added the following two lines:

 sundealers: ":include:/etc/mail/lists/sundealers"
 update_sundealers: "|/usr/local/bin/listmaker.pl > /etc/mail/lists/sundealers"
Once newaliases is run, everything should be ready to go.

Conclusion

There are several weaknesses in this script. There is no audit trail or authorized users list, so there is a risk that anyone could update a list with bad data. One potential solution is to write an entry to the syslog using logger whenever someone sends a message to listmaker.pl. The From: and To: addresses should get logged so you know who updated which list.
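
As a rough illustration of that logging idea, the sketch below builds an audit line from the From: and To: headers and hands it to syslog. It swaps the logger command for Python's standard syslog module (Unix only), and the names audit_line and log_update, the "listmaker" tag, and the message format are all assumptions made for the example.

```python
from email import message_from_string

def audit_line(raw_message):
    """Build a one-line audit record from the message headers."""
    msg = message_from_string(raw_message)
    sender = msg.get("From", "unknown")
    target = msg.get("To", "unknown")
    return "list update: from=%s to=%s" % (sender, target)

def log_update(raw_message):
    """Write the audit record to syslog under the mail facility."""
    import syslog  # Unix-only module, imported here so audit_line stays portable
    syslog.openlog("listmaker", syslog.LOG_PID, syslog.LOG_MAIL)
    syslog.syslog(syslog.LOG_INFO, audit_line(raw_message))
    syslog.closelog()

example = "From: davin@foobar.net\nTo: update_sundealers@foobar.net\n\nbody\n"
print(audit_line(example))
```

Because the To: address names the update alias, the logged line records both who sent the update and which list it was meant for.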

Also, the update process for the users is not trivial. There is no easy way in Outlook to do this, and mistakes are common. If you're not paying attention, missing one option in a dialog box ruins the whole process. Outlook, by default, wants to attach a Distribution List in MIME format. If you're exporting the Distribution List, Outlook wants to use RTF as the default, not ASCII. The next planned improvement on this script will be to parse a MIME-encoded Distribution List. This would allow users to simply forward the Distribution List to the update email address.

The listmaker.pl script itself does only rudimentary checking on the contents of each line. For example, the test for a header line looks only for a colon. Since colons are not normally found in email addresses, this didn't seem like a bad choice. If, by some freakish chance, a "valid" line with an email address had a colon in it (perhaps in the display name to the left of the email address), that email address would get thrown out.
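
One way around the colon problem, sketched here in Python as an assumption rather than as a change the author made, is to rely on the fact that mail headers end at the first blank line and to look for addresses only in the body. The function name body_addresses and the sample message are hypothetical.

```python
import re

# Assumption: the address is the last whitespace-delimited token on the line.
ADDRESS_AT_END = re.compile(r"(\S+@\S+)\s*$")

def body_addresses(raw_message):
    """Return addresses found in the body, ignoring the header block entirely."""
    normalized = raw_message.replace("\r\n", "\n")
    # Headers end at the first blank line; everything after it is the body.
    _headers, _separator, body = normalized.partition("\n\n")
    found = []
    for line in body.splitlines():
        match = ADDRESS_AT_END.search(line)
        if match:
            found.append(match.group(1))
    return found

raw = (
    "From: davin@foobar.net\n"
    "To: update_sundealers@foobar.net\n"
    "\n"
    "Sales: East Region (E-mail) east@store.net\n"
    "Frank Zappa (E-mail) frank@zappa.net\n"
)
print(body_addresses(raw))
```

The trade-off is that any line in the body ending in something address-like is accepted, so the "header" that Outlook writes at the top of the exported file would still need its own check if it ever contained an @ sign.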

All in all, the system has worked fairly well. If you use or modify this script, please let me know via email: dpetersen@cosmostech.com.

Davin Petersen (dpetersen@cosmostech.com) is a Sr. Systems Engineer for Cosmos Technology, a leading provider of Storage and UNIX systems (www.cosmostech.com). He got an early start with UNIX and tries to help others make informed technology decisions. When he's not working, Davin chases after his 3-year-old and crawls with his 6-month-old.