Building a Secure Web Site
Arthur Donkers
The World Wide Web has entered the business world and
is here to stay.
Apart from the more well-known application as Web server
on the
Internet, the Web is also used as a new way of distributing
information
within a company. With this free flow of information,
it has become very
important to protect private information from unwanted
peeking, so you
need to establish a secure environment between a client
(the browser)
and a server (the machine serving up the information).
This article will
show you how one can establish such a secure environment
by using
commonly accepted techniques.
Introduction
Although the reasons for security may seem obvious,
it is good policy to
review them consciously. There are two reasons for wanting
a secure
communications channel with your favorite Web site.
First of all, you
must establish the identity of the parties involved.
You want to be sure
that the other side really is the site they are claiming
to be. Sending
confidential information to someone you do not know
or do not trust can
have disastrous consequences. Imagine sending your credit
card number to
a Web site that claims to be an electronic shopping
mall, but in reality
is a machine operated by a malicious hacker. The person
that gathered
all of those credit card numbers could then use them
to buy all kinds of
products at your expense. So, you need to make absolutely
sure that the
server you are communicating with is the real thing.
(Server validation
does not, of course, eliminate the usual precautions
of verifying that
the business is actually reputable.) This identity verification
also
applies to the client sending a request to a server.
When the server is
used to store and retrieve confidential information,
it needs to be sure
of the identity of any client that requests information.
The second important point is the protection of the
data exchanged
between client and server. Once they have established
each others
identity, the data needs to be encrypted to make sure
that only the
client and server can read it. The reason for this is
simple. If that
same malicious hacker succeeded in eavesdropping on
the actual
communications, he or she could record the data exchanged
and examine it
afterwards. In this case the identity of this snooping
machine does not
matter as it only reads the data that flows between
the client and
server. Snooping is one thing, but if that hacker succeeded
in becoming
"the man in the middle," he or she could be
able to change the data
between receiving it and sending it to its destination.
So, encryption
offers a way of ensuring the integrity of the data.
For more detailed
information about cryptography in general, I recommend
Bruce Schneier's
book, Applied Cryptography. Be aware, however, that
if you are dealing
across national boundaries, import/export considerations
may come into
play. Export of sophisticated cryptographic software
from the United
States, although under review, is currently prohibited.
Thus, a U.S.
company dealing with foreign customers may be forced
to use a different
cryptographic strategy than one dealing solely with
U.S.-based
customers.
A Web Server
Before we can secure the traffic with a Web server,
we first need to
find out how the browser and server communicate. The
basic way of
communicating is to have the browser send a request
to the server. The
server processes this request and returns the resulting
data to the
browser. The browser will then interprete these data
and display them on
screen. The connection between the browser and client
is based on TCP,
so a virtual circuit is established between the browser
and server for
each request. The reason that TCP is used instead of
UDP has to do with
reliability. UDP datagrams can be lost without detection
by the sender.
This could mean that a request that was issued never
reached the server.
So, by using TCP the browser can be sure that a request
that has been
sent will eventually end up at the server.
A connection between browser and client exists during
the processing of
a request. So for one page, this might mean that a browser
will send
more than one request to the server and, thus, have
more than one
connection active at the same time. (e.g., when a page
contains an
image, this image is requested with a separate request
and the data are
processed separately). The question now rises which
of these requests
should be protected and which can be left unprotected.
The simplest approach would be to say that all forms
that contain
sensitive data must be protected from snooping. So,
when a user has
entered the necessary data in this form (e.g., his or
her name, phone
number, and credit card number) that data must be submitted
to the
server through a secure channel. Remember that entering
the data is a
local thing; it is the browser that reads the data from
the user, enters
it into the form, and displays it. Only when the user
presses the SUBMIT
button (or whatever it is called on the form), are the
data packed into
URL format and sent to the server. Normally, the server
will process
these data with a special CGI script. This script will
then process the
data and display information to the user. This receiving,
processing,
and displaying is also done in one connection. It is
mainly these types
of requests that need to be protected.
Secure Socket Layer
A special standard has been proposed (and accepted)
to protect the
sensitive Web requests as previously described. This
standard is called
SSL, which stands for Secure Socket Layer. This layer
sits on top of the
"normal" Berkeley sockets and provides a way
both to verify the identity
of the client and server and to encrypt the data exchanged
between the
client and server.
The Secure Socket Layer standard has been devised by
Netscape and has
been adopted by a large number of vendors. On Netscape's
Web site,
(http://www.netscape.com), you will find the specifications
of SSL (both
version 2.0 and 3.0) and a reference implementation
called SSLref. You
can use this reference implementation as an example
if you want to build
a secure communication package. As far as I know, this
SSLref
implementation is only available for U.S. residents.
SSL operates in two steps. The first step is used when
building a
connection. During the establishment of a connection
both client and
server exchange information about each other. This step
is shown in
Figure 1. (This figure is also available on Netscape's
Web site).
The following is a "free hand" description
of the SSL 3.0 protocol
definition. The parameters of the session are determined
by the SSL
Handshake Protocol. When an SSL client and server communicate,
they must
establish a protocol version, select cryptographic algorithms,
authenticate each other (optional), and use public-key
encryption
techniques to generate shared secrets. These processes
are performed in
the handshake protocol.
The client sends a ClientHello message to the server.
The server must
respond with a ServerHello message, or else a fatal
error will occur,
and the connection will fail. The ClientHello and ServerHello
messages
are used to establish the security relation between
client and server.
They establish the following attributes: protocol version,
session ID,
cipher suite, and compression method. Additionally,
two random values
are generated and exchanged: ClientHello.random and
ServerHello.random.
These can later be used to verify the identity of both
the client and
server.
Following the hello messages, the server will send its
security
certificate, if it needs to be authenticated. Additionally,
a
ServerKeyExchange message may be sent if necessary (e.g.,
if the server
has no certificate, or if the certificate is for signing
only and not
encrypting). If the server is authenticated, it may
request a
certificate from the client, if appropriate. In most
Web-related
communications, the client is NOT authenticated. For
on-line links in a
real client/server environment, the client does need
to be
authenticated. Next, the server will send the ServerHelloDone
message,
indicating that the hello message phase of the handshake
is complete.
The server will then wait for a client response.
If the server has sent a certificate request message,
the client must
send either the certificate message or a no certificate
alert. Then, the
client key exchange message is sent, and the content
of that message
will depend on the public key algorithm selected between
the client
hello and the server hello. If the client has sent a
certificate with
signing ability, a digitally signed message is sent
to explicitly verify
the certificate.
At this point, a ChangeCipherSpec message is sent by
the client, and the
client copies the PendingCipherSpec into the CurrentCipherSpec.
The
client then immediately sends the finished message under
the new
algorithms, keys, and secrets. In response, the server
will send its own
ChangeCipherSpec message, transfer the pending to the
current cipher
spec, and send its Finished message under the new cipher
spec. At this
point, the handshake is complete, and the client and
server can begin
exchanging application layer data. This data will be
encrypted using the
Cipher Spec (encryption method) agreed upon during the
ChangeCipherSpec
exchange.
As you can see from this description, a lot of information
is exchanged
between the client and server before even one byte of
data is
transmitted. The reason for this is that both client
and server must
agree upon the encryption method used and upon each
other's identity.
One concept to note in the process described above is
that of
certificates. A certificate is nothing more than an
electronic way to
verify the identity of the sender. To increase the reliability
of such a
certificate, it must be issued by an organization that
is trusted by
both communicating parties. This organization is normally
called a CA
(Certification Authority). An example of such a certificate
is shown in
Figure 2.
This certificate contains all kinds of administrative
information,
including the period for which it is valid, the organization
that issued
it, and for whom it was issued. But the most important
fields in the
certificate are the public key field and the signature
field. The public
key field is the public key of the server (See sidebar
"Public Key
Cryptography") that the client can use to decrypt
the information
received from the server. The signature field is used
to verify the
certificate itself. It can be decrypted with the public
key of the
server and contains a hash of the complete certificate
(using the MD5
hash algorithm). The public key of a server is used
during the initial
phase to verify the identity of the server (or client
if need be).
To be able to use SSL as a secure communications mechanism,
you need
both a browser and server that support this protocol.
Fortunately, most
available browsers have support for SSL already built
in. Note that due
to the ITAR regulations, the U.S.-based browsers are
available in both
an export version (using weak keys) and a domestic version
(using strong
keys).
The other side of the connection, the Web server program
or http daemon,
is a different story. Some companies use Netscape's
Commerce Server,
which also has SSL support built in. However, a number
of http servers
are available through the Internet. The most well-known
of these is
Apache, an http server that is based on the NCSA daemon.
The Apache
daemon is well-known for its speed, reliability, and
extensibility. It
is very simple to add modules to the Apache daemon (e.g.,
to support
anonymous ftp via your Web pages with built-in SSL support).
Getting and Building SSL
The getting and building of a SSL library has some legal
implications.
There is a version of SSL written by Eric A. Young available
from:
http://www.psy.uq.oz.au/~ftp/ \
Crypto as SSLeay
that can be used in both commercial and noncommercial
applications. It
is completely compatible with the SSL specs. However,
as I explained
before, SSL uses the RSA algorithm to exchange the encryption
keys and
to verify the identity of the server and client. The
RSA algorithm is
patented in the United States. So, if you want to use
the publicly
available version of SSL, you should probably use it
with the RSAREF
(available through RSA, http://www.rsa.com) implementation
instead of
the one available in the SSL distribution. This automatically
limits you
to applications in the noncommercial environment. Before
you download
anything and start deploying it, however, please remember
the earlier
import/export caveat and check your local legislation.
After you have downloaded the source file, you can uncompress
and untar
it in an appropriate working directory. You can then
follow the
instructions in the INSTALL file to build the libraries
and binaries on
your platform, which is all rather straightforward.
After installing SSL
you will have a library and a number of binaries you
can use to generate
certificates, as well as to encrypt and decrypt data.
Once you have SSL running, you can generate your own
certificates with
the req program, which enables you to create a number
of certificates.
The options for this command are shown in Figure 3.
You can use these
(dummy) certificates for testing purposes. With the
following command
(assuming SSL is installed in the /usr/local/ssl directory),
you can
generate such a certificate:
cd /usr/local/ssl/certs
req -new -x509 -days 60 -out test.pem -keyout test.pem
Running this command will generate the output and series
of questions
shown in Figure 4. This will create a new certificate,
with the
information you entered here. This is a self-signed
certificate, so the
issuer and subject fields should be the same. The passphrase
you entered
at the beginning is used to encode the generated file,
so somebody
snooping around on your system cannot read the private
key. If you had
used the -nodes option while generating, this question
for a passphrase
would have been skipped, and your certificate would
not be secure
against reading.
You can verify the generated certificate with the verify
command:
verify /usr/local/ssl/certs/test.pem
Note that this only works well when you are using a
nonencrypted
certificate. It is important to become familiar with
the different
commands in the SSL suite, as you need them to manage
the certificates
for your secure Web server.
Building Apache
Once you have the SSL code up and running, you need
to get the plain
Apache sources. These are available through the Apache
organization and
can be downloaded from:
http://www.apache.org
As of this writing, two versions of the Apache daemon
are available,
1.0.5 and a 2.0 beta. If you want to use Apache in a
production
environment, I recommend using version 1.0.5, which
has proven to be
stable and reliable. The 2.0 beta version is experimental
and might give
you some problems under heavy load. Also, if you are
already using
Apache version 1.0.3, I suggest you upgrade to version
1.0.5. The older
version contains a nasty security bug that may compromise
your server.
At the Apache home page you will also find links to
patch files to add
SSL support to the version of Apache you are using.
It may be a good
thing to download this patch file as well. If you cannot
find it here,
you may also try the SSL homepage. This also has references
to the
Apache patches.
After downloading the tar file for Apache, you can uncompress
and untar
it. You should then edit the Configuration file to reflect
your system
and run the Configure program to build a Makefile. Once
you have this
Makefile, you can call make, and the Apache daemon will
be built. In
this way, you will get a "plain" Apache daemon
without SSL support. I
recommend doing this to see if you can build a properly
working Apache
daemon on your system before introducing the additional
complication of
SSL.
Once you have a properly running Apache daemon, you
can apply the SSL
patches. Probably the best thing to do is to make a
copy of the plain
directory and name it something like apache-1.0.5.ssl.
This is a way to
ensure you also have the original sources available.
Next, you can apply
the patches according to the instructions in the patch
file, run
Configure again, and build the new Apache daemon. Make
sure that you
have uncommented the last line in the Configuration
file, so SSL support
is included:
# Apache inverts the module list. SSL must go first to
# fake basic authorization. So, uncomment this line to
# add SSL:
Module ssl_module apache_ssl.o
After building, you will get a program called httpsd,
for secure http
daemon. You can then install it, preferably next to
the normal http
daemon.
Configuring httpsd
Once you have the httpsd built, you need to configure
it before you can
use it. I will only show you the SSL-related configuration
items, the
rest is normal Apache configuration. The first file
you need to adapt is
the httpd.conf file. I copied the existing file and
named the copy
httpsd.conf. When you start the httpsd server you can
use the -f option
to specify the configuration file, in this case /etc/httpsd.conf.
First,
you should change the portnumber the server will listen
on. A normal
http server will listen on port 80, whereas a secure
http server
normally listens on port 443. Remember to add this port
to your
/etc/services file, so you can use it by name rather
than by number.
Below is an excerpt from the services file:
https 443/tcp
The additional configuration items are related to the
location of your
server certificate and the certificate of the CA. All
of these items are
explained in the documentation that comes with the patch
file. Figure 5
shows an example of the configuration.
Note that you really should generate a certificate with
encryption to
prevent anyone from reading the server's private key
from the file.
This, however, brings one complication - once you start
the secure Web
server, you need to enter the decryption key so the
daemon can read the
file. This password is read from a tty (SSL insists
on that), so
starting the secure httpsd completely automatically
is not possible.
Using a Secure Web Site
Once you have succeeded in creating a certificate and
starting your
httpsd, you can start accessing it with your browser.
The URL for these
secure pages would be something like:
https://scarab.reseau.nl/reseau.html
Note the https instead of the usual http. Accessing
these pages from
Netscape will issue a warning. The self-signed certificate
of your
server will not immediately be accepted by Netscape.
It does not
recognize the CA (you) that signed the certificate,
and Netscape 1.x may
refuse to access the pages. If you are using Netscape
2.x, you are
presented with a warning screen, shown in Figure 6,
that guides you
through accepting the certificate of the unknown server.
You can specify whether you want to reject the certificate,
accept it
for this session, or accept it always. If you accept
it, you will see
the page being loaded, and the key icon at the bottom
of the Netscape
window will change from broken to a normal key. When
accepting the
certificate, you can request more information about
the certificate
being used. An example of this is shown in Figure 7.
Conclusions
By using the SSL protocol you can secure your communications.
You can
secure not only your Web traffic, but your other traffic
as well (e.g.,
telnet and ftp). The only drawback of all this is the
export limiting
regulations that restrict the export of strong encryption
software from
the U.S. In my opinion, this will lead to a situation
in which every
country makes their own standard, which may hinder international
communications.
About the Author
Arthur Donkers graduated from the Delft University
of Technology with a
degree in Electrical Engineering, with a major in Computer
Architecture.
Since then he has worked for a several software houses
in the
Netherlands and participated in a number of major projects.
His primary
field of interest in these projects has been, and still
is,
datacommunications. especially the integration of multi-vendor
networksystems. Due to the demand in the market, Le
Reseau now focuses
on network security-related projects and consultancy.
The last four
years he worked as an independant consultant for his
own company, Le
Reseau (french for "The Network").
|