Overcoming Large Outlook Distribution Lists: An Exercise in Sendmail Aliases
Davin Petersen
I recently found a significant flaw in Outlook 2000, related to
"Distribution Lists". Our company keeps in touch with
other equipment vendors by blasting sell or buy emails to multiple
vendors at one time. Some of our salespeople use lists to notify
customers of special product offerings or upcoming feature releases.
The size of our Distribution Lists ranges from 20 email addresses
to several hundred addresses.
The Problem
Outlook repeatedly times out when trying to send mail to the members
of these lists. During the SMTP conversation, Outlook stops sending
data; it is apparently waiting for something. Sendmail, on the other
hand, is also waiting for more data from Outlook. The only common
thread seems to be that the mail hangs around the 12th to 15th email
address. In trying to diagnose the problem I eliminated the user's
PC, the Outlook distribution list, and the size of the message.
Since none of these things seemed to make any difference I assumed
the problem must be in the conversation between Sendmail and Outlook.
We use a Red Hat 6.1 Intel box with Sendmail 8.9.3 for mail relay.
Because Sendmail is built for routing mail, and Outlook was not
having a problem sending "regular" email, I thought it
was best to have my Linux box do the list processing for Outlook.
When Outlook did send email to a large list, it took a few minutes
to complete the transaction. An obvious advantage of having Linux/Sendmail
do the list is that, from Outlook's perspective, you're
only sending the message to one recipient.
The Solution
The source of the problem eluded me. Was it a TCP timeout issue
in Windows 98, or Linux? Poor Sendmail configuration? A "feature"
of Outlook? Because I didn't have a lot of time to search for
patches to upgrade Sendmail, Outlook, Linux, or Windows 98, I needed another
solution. What I didn't need is a large-scale list server package
(majordomo, listserv, etc.) to administer in addition to the rest
of my daily chores. Simplicity of the system was key, as well as
ease of maintenance.
The first part of solving this problem was getting Distribution
Lists (DLs) in Outlook out to the mail server. Without this, I'd
have to find something commercial. This was not a big deal since
Outlook can export a DL by saving it to your disk. DLs can be saved
in several formats -- .txt being the most attractive. The only
hangup was that the contents of an Outlook DL in text format look
something like this:
Frank Zappa (E-mail) frank@zappa.net
Computer Store computer@store.net
someotheremail@someotherplace.net (E-mail) someotheremail@someotherplace.net
The "(E-mail)" was on some lines, not on others, and the
alignment of characters was not consistent. It appears that, in the
saving process, the text was aligned on tabs but was converted to
spaces before being written. There was also a "header" on
the file. This was not an acceptable form; it would have to be changed.
The next step was to get the DL to the server. My solution needed
to consider first the user's current skills and second the
tools available to me (Sendmail, Perl, sed, awk, grep,
etc.). The simplest solution was to email the DL, manually,
in the body of an Outlook message. This relies on something the
user already knows and leaves the rest of the work for the back-end
tools.
The process for a user to send an update over to the server is
as follows:
1. Save the distribution list as a text file
2. "Insert as text" into the body of an email message
3. Send the message to the update list alias
This system gets me three things:
1. Delegation -- The list is transmitted to the mail server without
adding any overhead to me
2. Decentralized Management -- A method to update the list
whenever the "list maintainer" wants to do so
3. Centralized Data -- Multiple people can use the same lists
People who work with dealer/customer email have their own lists,
so I didn't have to worry about multiple people updating the
same list. It's a small advantage to be able to access another
person's DL. Previously, you would have to forward the DL to
another Outlook user and then worry about how old the recipient's
copy of the list was.
A Bit About Mail Aliases
At the core of most list-processing packages is an email alias
that points to a file that contains all the email addresses for
that list. In Sendmail, the /etc/aliases file holds the aliases.
The basic form of an alias is:
alias: other email address
e.g. admin: davin@foobar.net
There are two special forms of an alias: one for lists, one to pass
the message through an external program. Both forms are necessary
to make the mail work.
list-alias: ":include:/path/to/some/list-file"
program-alias: "|/path/to/some/program"
After modifying /etc/aliases, run the newaliases
command to rebuild the alias database that Sendmail reads.
The format of a list-file is one email address per line. newaliases
does not need to be run after a list has been updated. When Sendmail
passes an email message to the program-alias, it sends the whole
message, mail headers included, to the program on STDIN.
listmaker.pl
The first hurdle to overcome is re-formatting the text DL into
a list file. I chose Perl to do the task, but I could have chosen
sed and maybe even grep. The script, called listmaker.pl,
is fairly straightforward (see Listing 1). It takes a file on STDIN
and sends the output to STDOUT. Basically, the script chops the
end of line off the input string, skips blank lines, skips lines
without email addresses, and skips email header lines. If it passes
those tests, it grabs the email address and prints it to STDOUT.
The last line is for debugging and testing.
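For readers who prefer the standard text tools mentioned above, the same filtering steps can be approximated in a shell function. This is a minimal sketch, not the Perl code from Listing 1: the function name listmaker_sketch is hypothetical, and it assumes the address is the last whitespace-separated field on each line.

```shell
#!/bin/sh
# Hypothetical stand-in for the filtering steps described above; not Listing 1.
# Reads an exported DL (or a whole mail message) on STDIN and writes one
# address per line on STDOUT.
listmaker_sketch() {
    awk '
        { sub(/\r$/, "") }        # drop a trailing carriage return, if any
        /^[ \t]*$/ { next }       # skip blank lines
        /:/        { next }       # skip mail header lines (anything with a colon)
        !/@/       { next }       # skip lines without an email address
        $NF ~ /@/  { print $NF }  # the address is assumed to be the last field
    '
}
```

It is used the same way as the Perl script, for example: listmaker_sketch < infile > outfile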
The regular expression that collects the email address uses a
non-greedy quantifier (in Perl, .+? is the not-so-greedy version of .+). This prevents too much of
the input string from getting sucked up by the regular expression.
The \W is a Perlism for "any non-word character", which includes
whitespace (\s is the class for whitespace alone). The parentheses
are regex meta-characters that create a group around the characters
that make up the email address. After starting with some more complicated
expressions, and not getting the desired result, I consulted a friend
to help figure this out. One of his Perl books happened to have
a regex for this very thing. The potential problem with the regex
is that the email must be at the end of the line.
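The end-of-line limitation is easy to see with a small experiment. The sketch below uses sed with a POSIX extended regular expression rather than Perl, so it is an analogue of the idea and not the expression from Listing 1; the function name last_address is hypothetical. POSIX has no non-greedy quantifier, so the address is restricted to non-whitespace characters instead.

```shell
#!/bin/sh
# Hypothetical helper: print the address that ends each input line, if any.
# The trailing $ anchors the match to the end of the line, so an address
# followed by other text is not matched and the line produces no output.
last_address() {
    sed -n -E 's/^(.*[[:space:]])?([^[:space:]@]+@[^[:space:]@]+)[[:space:]]*$/\2/p'
}

# Address at the end of the line: prints frank@zappa.net
printf '%s\n' 'Frank Zappa (E-mail)    frank@zappa.net' | last_address

# Address at the start of the line: prints nothing
printf '%s\n' 'frank@zappa.net Frank Zappa' | last_address
```

The parenthesized second group plays the same role as the capturing group described above: it is the only part of the line that is kept.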
The command line usage of the script is:
listmaker.pl <infile >outfile
If you prefer pipes:
cat infile | listmaker.pl > outfile
Since the script itself does not contain the filename to send the
output to, it can be used for any number of lists. I chose to store
the lists in the directory /etc/mail/lists. Since Sendmail
under Red Hat runs as user/group mail/mail, the /etc/mail/lists
directory must be mode 755, owned by the mail user and mail group.
Sendmail-spawned processes will create all the lists stored there.
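The directory setup described above might be scripted as follows. This is a sketch under the article's stated assumptions (a mail user and mail group, mode 755); the function name setup_list_dir is hypothetical, and the ownership change is attempted only when the script runs as root.

```shell
#!/bin/sh
# Hypothetical helper: create a list directory with the mode and ownership
# described above. Pass the directory as the first argument.
setup_list_dir() {
    dir=$1
    mkdir -p "$dir" || return 1
    chmod 755 "$dir" || return 1
    # Changing ownership requires root; skip it otherwise.
    if [ "$(id -u)" -eq 0 ]; then
        chown mail:mail "$dir" || return 1
    fi
}

# Example (as root): setup_list_dir /etc/mail/lists
```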
To install this program and the list into the /etc/aliases
file, I added the following two lines:
sundealers: ":include:/etc/mail/lists/sundealers"
update_sundealers: "|/usr/local/bin/listmaker.pl > /etc/mail/lists/sundealers"
Once newaliases is run, everything should be ready to go.
Conclusion
There are several weaknesses in this script. There is no audit
trail or authorized users list, so there is a risk that anyone could
update a list with bad data. One potential solution is to write
an entry to the syslog using logger whenever someone sends
a message to listmaker.pl. The From: and To:
addresses should get logged so you know who updated which list.
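One way to add that logging is a small wrapper that sits between the alias and listmaker.pl. The following is a sketch only: the function names header_value and log_and_update, the syslog tag listmaker, and the mail.info priority are illustrative assumptions, not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first header named $1 found in
# the message file $2, looking only at lines before the first blank line.
header_value() {
    awk -v name="$1" '
        /^\r?$/ { exit }                        # headers end at the first blank line
        {
            prefix = tolower(name) ":"
            if (index(tolower($0), prefix) == 1) {
                value = substr($0, length(prefix) + 1)
                sub(/^[ \t]+/, "", value)       # trim leading whitespace
                sub(/\r$/, "", value)           # trim a trailing carriage return
                print value
                exit
            }
        }
    ' "$2"
}

# Hypothetical wrapper: save the message arriving on STDIN, log who updated
# which list, then hand the message to listmaker.pl. $1 is the list file.
log_and_update() {
    msg=$(mktemp) || return 1
    cat > "$msg"
    logger -t listmaker -p mail.info \
        "list update: from=$(header_value From "$msg") to=$(header_value To "$msg") file=$1"
    /usr/local/bin/listmaker.pl < "$msg" > "$1"
    status=$?
    rm -f "$msg"
    return $status
}
```

Saved as a script that ends by calling log_and_update "$1", this would replace the direct call to listmaker.pl in the program alias, with the list file passed as the argument instead of a shell redirection.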
Also, the update process for the users is not trivial. There is
no easy way in Outlook to do this, and mistakes are common. If you're
not paying attention, missing one option on a dialog box ruins the
whole process. Outlook, by default, wants to attach a Distribution
List in MIME format. If you're exporting the Distribution List,
Outlook wants to use RTF as the default, not ASCII. The next planned
improvement on this script will be to parse a MIME-encoded Distribution
List. This allows users to simply forward the Distribution List
to the update email address.
The listmaker.pl script itself only does rudimentary checking
on the contents of each line. For example, the test for a header line
matches only a colon. Since colons are not allowed in email addresses,
this didn't seem like a bad choice. If, by some freakish chance,
a "valid" line with an email address had a colon in it
(perhaps in the display name to the left of the email address) that
email address would get thrown out.
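A possible way around the colon problem is to rely on the structure of a mail message instead: headers end at the first blank line, so everything after it is list data and needs no colon test. The sketch below illustrates the idea in awk; it is an assumption about how the check could be tightened, not a description of listmaker.pl, and the function name body_addresses is hypothetical.

```shell
#!/bin/sh
# Hypothetical filter: ignore the mail headers (everything up to the first
# blank line), then print the last field of each body line that holds an "@".
body_addresses() {
    awk '
        { sub(/\r$/, "") }                    # tolerate CRLF line endings
        !in_body { if ($0 == "") in_body = 1; next }
        NF && $NF ~ /@/ { print $NF }         # colons in display names are harmless
    '
}
```

The trade-off is that this version expects a complete mail message on STDIN. A bare exported DL with no blank line in it would produce no output, so it suits the program alias but not the command-line usage shown earlier.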
All in all, the system has worked fairly well. If you use or modify
this script, please let me know via email: dpetersen@cosmostech.com.
Davin Petersen (dpetersen@cosmostech.com)
is a Sr. Systems Engineer for Cosmos Technology, a leading provider
of Storage and UNIX systems (www.cosmostech.com).
He got an early start with UNIX and tries to help others make informed
technology decisions. When he's not working, Davin chases after
his 3-year-old and crawls with his 6-month-old.