Web Site Management by Mail
Managing a Web site can be easy, provided you have enough bandwidth to
run a comfortable telnet shell on your desktop. This is great when you
work at the office and your Web server is on the same LAN, but becomes a
nightmare when you're connected through a low-speed link and must cross
several unknown network paths. I figured that email response time would
normally be adequate for this purpose, so I decided to use email to feed
a simple mail message parser that transforms raw HTML-formatted text
into full-fledged Web pages. This article describes the basic
implementation of the system through a set of shell scripts that are
easily extensible to meet other needs.
What I Needed to Do
I wanted to keep the functionality simple: creating html files and
putting them in a selected directory. Although the objective is
straightforward, the speed of the connection can present logistical
difficulties. Thus, I decided to prepare a raw HTML document (i.e., only
the body, excluding any header, footer, style, etc.) and send it by
email to an alias named mail2html, which corresponds to a pipe to a
parser coded in shell. The challenge then was to make the parser vary
its behavior according to instructions included at the top of the email
message.
The Basic Requirement
My Web site is bi-lingual, offering two parallel threads,
one in English
and the other in my national language, Italian. This
means that two
different directory trees exist, to be managed separately
with their own
styles, headers, and related issues. The parser should
therefore be able
to act differently depending on the selected language,
build the target
file accordingly, and put it in the correct place. Also, it is
necessary to ensure that only messages from trusted users are processed,
and a log is necessary to verify that the system is working and what it
did.
The parser was the most challenging part, due to the various kinds of
string processing involved. The problem was eased by the use of regular
expressions and simple UNIX commands such as cut, sed, and ed. The
manipulation done with ed scripts could have been implemented in awk or
Perl, but I chose ed because of my greater familiarity with that tool
and because of time constraints.
To make message processing with ed scripts feasible, I defined a syntax
for incoming messages. Messages have two sections: a parameter section
and an html body, separated by one or more blank lines. The parameter
section is bounded by BEGIN_PARAMS and END_PARAMS tags at the beginning
of the line. Comments in the usual shell syntax (lines starting with #)
may also be included. Inside the parameter section, various tags may be
included, one per line. I included tags for filling in the <TITLE> and
heading fields of the html file, the filename, and the language used for
processing. Adding other tags is only a matter of changing some lines in
the code. Tag syntax is straightforward: TAG=value<EOL>. The following
is an example of a typical parameter section:
T=This is an example of title
H=Heading of my example
The body of the message must be html-formatted, since
no processing is
done on it.
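As a rough illustration of this syntax, the following sketch pulls the
value of one tag out of the parameter section. It is not taken from
Listing 1: it uses sed and grep rather than ed, and the function name
get_param is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not the author's code: print the value of one tag
# from the parameter section of a message read on standard input.
# Usage: get_param TAG < message
get_param() {
    # Keep only the lines between BEGIN_PARAMS and END_PARAMS, drop
    # shell-style comment lines, then print what follows "TAG=" on the
    # first matching line.
    sed -n '/^BEGIN_PARAMS/,/^END_PARAMS/p' |
        grep -v '^#' |
        sed -n "s/^$1=//p" |
        head -n 1
}
```

With a saved message in a scratch file, something like
get_param T < mailfile would print the title value. Lines in the html
body that happen to start with a tag name are ignored, because only the
bounded section is scanned.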
The code, shown in Listing 1, starts with variable initialization and
directory changes. Then, standard input (i.e., the email message coming
through a pipe) is captured in a temporary file. The email message is
composed of three sections, not the two required by the defined syntax,
because the intervening mailer has added email headers at the top of the
file. Mail headers are useful for identifying the sender of the message,
who is checked against the trusted user. If the sender does not match, a
record is added to the error file, and mail is sent to the trusted user
to warn him or her of the potential intrusion.
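The sender check can be sketched along the following lines. This is an
assumption about how such a test might look, not the code in Listing 1;
the function name is_trusted and the addresses used with it are
hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch, not Listing 1: succeed only when the first "From:"
# header of the message on standard input contains the trusted address.
# Usage: is_trusted ADDRESS < message
is_trusted() {
    # Headers end at the first blank line; stop reading there so that a
    # "From:" line in the body cannot be mistaken for the header.
    sed '/^$/q' |
        grep -i '^From:' |
        head -n 1 |
        grep -F -q -i -- "$1"
}
```

A caller could then branch on the result, for example
is_trusted luca@example.org < mailfile || echo "rejected" >> errorfile.
The comparison is a plain substring match on a header the sender
controls, so it is only one layer of the protection described in this
article.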
No damage can be done to the main html tree, since every file is
created in an unprivileged user's home directory (defined by a variable)
under the public_html directory (the default for user URLs). A word
about this directory: because the final aim is to publish documents in
the main tree, and because I had no time to implement smart algorithms
(see To Do's in script), a mirror image of the main tree directories
must be present under the dummy user's tree. This ensures that no
strange manipulation can be applied, because files can be saved only in
existing directories. At the end (and after a thorough check performed
by the pages' author), a cron job can copy the new tree over the main
one. As a security precaution, the dummy user is disabled by an * in the
password field of the /etc/passwd file. The mail2html script is owned by
the dummy user with a mode of 750 (-rwxr-x---). This is important
because the trusted user is coded in clear, and nobody outside must be
able to read the file (a real hacker could patch the mail job while it's
in the queue for sending).
Finally, I suggest adding a .htpasswd file to further restrict access
to the dummy user's public_html directory (see httpd documentation for
details). Together these precautions should prevent misuse of the
program. After checking the new Web pages with a standard browser
(Mosaic, Netscape, or whatever), you can move the documents to the final
online location using the issue script. A pre-publishing task, changing
the date, can be performed by the setdate script (see Listing 2 and
Listing 3).
The message is parsed three times to extract the three sections into
separate files. The first file contains mail headers, and the "From:"
and the "From:"
field is checked against the trusted users file. If
the user is OK, the
job continues with parameter section parsing; otherwise
an error message
is generated and sent to the trusted user by email,
as well as written
in the error log file.
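The three-pass split described above might look roughly like this. The
sketch uses sed instead of the ed scripts of Listing 1, and the
function name split_message and its file arguments are placeholders.

```shell
#!/bin/sh
# Hypothetical sketch: split a mail message into header, parameter, and
# body files. The file-name arguments are placeholders, not the names
# used in Listing 1.
# Usage: split_message MESSAGE HEADERS PARAMS BODY
split_message() {
    # Pass 1: mail headers run from the top through the first blank line.
    sed '/^$/q' "$1" > "$2"
    # Pass 2: the parameter section, boundary tags included.
    sed -n '/^BEGIN_PARAMS/,/^END_PARAMS/p' "$1" > "$3"
    # Pass 3: everything after END_PARAMS, minus leading blank lines.
    sed '1,/^END_PARAMS/d' "$1" | sed '/./,$!d' > "$4"
}
```

Reading the same scratch file three times keeps each pass trivial, at
the cost of a little extra I/O, which hardly matters for messages of
this size.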
Finally, parsing of the HTML page text starts. This is done through two
small shell scripts, build and setdate. Both are quite simple. build
accepts a rough HTML file as input and wraps it with a standard header
and footer in the selected language. It calls setvar to initialize local
variables (see Listing 4 and Listing 5). setdate sets the date in the
relevant field of the file, again in the style of the selected language.
The work is done by a series of sed statements that are strictly related
to the header and footer contents. The behavior of the scripts is
determined by the parsed parameters.
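To give an idea of the wrapping step, here is a minimal sketch. It is
not the build script of the listings: it assumes one header and one
footer file per language, named header.LANG and footer.LANG in a
template directory, and both that layout and the name wrap_page are
illustrative.

```shell
#!/bin/sh
# Hypothetical sketch of the wrapping step, not the build script itself.
# It assumes one header and one footer file per language, named
# header.LANG and footer.LANG inside a template directory.
# Usage: wrap_page TEMPLATE_DIR LANG BODY_FILE > page.html
wrap_page() {
    # Refuse to build a page when either template is missing.
    [ -r "$1/header.$2" ] && [ -r "$1/footer.$2" ] || return 1
    # The page is simply header + raw body + footer.
    cat "$1/header.$2" "$3" "$1/footer.$2"
}
```

Keeping the language in the file name means a new language needs only
two new template files and no change to the function.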
After building the file, you must insert the values you selected for
the parameters. Once more, an ed "here" script easily does the job,
replacing the TITLE and HEADING keywords with the values supplied in the
parameters section. Finally, the file is moved to its destination, as
defined in the parameters, and a success entry is added to the logfile.
Scratch files are cleaned up before exiting.
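The keyword substitution can be sketched as follows. Note that this
version uses sed rather than the ed "here" script of Listing 1, and it
assumes the template's own tags are lowercase so that the uppercase
TITLE and HEADING keywords do not collide with tag names. The function
name fill_keywords is hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch, using sed instead of an ed "here" script: replace
# the TITLE and HEADING keywords in a page with the supplied values.
# Usage: fill_keywords TITLE_TEXT HEADING_TEXT < built_page > final_page
fill_keywords() {
    # Escape the characters that are special in a sed replacement:
    # backslash first, then the / delimiter, then &.
    t=$(printf '%s\n' "$1" | sed -e 's/\\/\\\\/g' -e 's/\//\\\//g' -e 's/&/\\&/g')
    h=$(printf '%s\n' "$2" | sed -e 's/\\/\\\\/g' -e 's/\//\\\//g' -e 's/&/\\&/g')
    sed -e "s/TITLE/$t/g" -e "s/HEADING/$h/g"
}
```

The escaping matters because values arrive by mail: a title containing
a slash or an ampersand would otherwise break or distort the
substitution.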
Setup of mail2html is quite simple and can follow the
steps shown below,
or be modified based on the site's configuration. Just
be careful about
1) Identify dummy user (for instance, guest). Create it if necessary,
and disable it immediately by putting a * in /etc/passwd's password
field.
2) Copy mail2html and other support scripts to an appropriate
directory, under dummy user's home.
3) Change ownership of the above files to the dummy user, with mode 700.
4) Edit the mail aliases file (usually /usr/lib/aliases) and add the
following line:
mail2html: "| /dummy_user_home/dir/mail2html"
5) Run newaliases (or a corresponding utility on your
system) to update
the aliases database.
6) Change to dummy user's home and create the public_html subdirectory.
7) Create, starting at public_html subdir, a mirror
of the directory
tree ~httpd/htdocs (or whatever is the base html tree).
8) Create, in public_html subdir, a .htaccess file containing access
information for the directory and its subtree. The file must be world
readable. Here is an example:
AuthName this item
allow from all
deny from ncsa.uiuc.edu
9) Create .htpasswd file (as indicated in AuthUserFile tag of .htaccess
file) containing the username and CRYPTED password of the users allowed
to access it. It is best to use the ~httpd/support/htpasswd utility, but
an edited /etc/passwd line may be used. Again, the file must be world
readable.
10) Edit the mail2html and other support scripts to set the
variables (path, trusted user, etc.).
11) Create the log and error files by touching them.
12) You're done. Try creating an html file, put a parameter section at
the top, and send it by mail to mail2html. You should find the html
document exactly where you specified it should be (with the F=filename
tag), under dummy user's public_html subtree. You may view it with a
standard html browser.
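Step 7, mirroring the directory tree, is the one most easily scripted.
The following is only a sketch under the assumption that a copy of the
directory layout (without files) is all that is needed; the function
name mirror_dirs and any paths passed to it are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch of step 7: recreate the directory layout of the
# main html tree (directories only, no files) under public_html.
# Usage: mirror_dirs SOURCE_TREE TARGET_TREE
mirror_dirs() {
    mkdir -p "$2" || return 1
    # Resolve the target to an absolute path before changing directory.
    target=$(cd "$2" && pwd) || return 1
    # The subshell keeps the caller's working directory unchanged; each
    # directory found under the source is created under the target.
    ( cd "$1" && find . -type d -exec sh -c 'mkdir -p "$0/$1"' "$target" {} \; )
}
```

It could be run as the dummy user, for example
mirror_dirs ~httpd/htdocs ~guest/public_html, and run again whenever
new directories are added to the main tree.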
If you follow the above setup strategy, no problems should arise.
Localizing the scripts may introduce errors, however, often with the
symptom of nothing happening. Specifically, in the event something goes
wrong, no output file will be generated. In troubleshooting, you can
simulate the mail pipe by feeding the script directly from a shell
account. Do this by sending the original mail message to yourself and
saving it in a scratch file. Then process the saved email with the
following command and watch the results:
cat mailfile | mail2html
If that technique does not show the source of the problem, turn
on shell debugging with:
cat mailfile | sh -x mail2html
This will show the commands as they are executed and help solve the
problem.
This set of scripts provides a simple solution to the problem I faced
due to low-speed connectivity. With a minimal amount of setup time, I
can "telecommand" a remote Web site in a reasonable time. The
functionality of the scripts and the appearance of the pages produced
can be modified by simply changing the support scripts and the standard
headers and footers. The scripts and the configuration procedure
described here provide rudimentary security using the concept of "hide
and protect." Feel free to embellish the code, but use it at your own
risk.
About the Author
Luca Salvadori is head of the Information Systems Dept.
at LABEN S.p.A.,
a leading European Space Electronics supplier. His experience ranges
from PDP-11 (whose manuals taught him English while at University)
through Windows, PCs, and networking. He manages the company's web site
(http://www.laben.it/) as well as (in his free time, with a bunch of
friends) another Linux box (http://aeroweb.lucia.it/) devoted to
aviation, his real, irresistible love. As a hobby, he flies
aerobatics. Luca can be reached at: firstname.lastname@example.org.