
Building a Secure Web Site

Arthur Donkers

The World Wide Web has entered the business world and is here to stay. Besides its well-known use for public Web servers on the Internet, the Web is also used as a new way of distributing information within a company. With this free flow of information, it has become very important to protect private information from prying eyes, so you need to establish a secure environment between a client (the browser) and a server (the machine serving up the information). This article shows how to establish such a secure environment using commonly accepted techniques.

Introduction

Although the reasons for security may seem obvious, it is good policy to review them consciously. There are two reasons for wanting a secure communications channel with your favorite Web site. First of all, you must establish the identity of the parties involved. You want to be sure that the other side really is the site it claims to be. Sending confidential information to someone you do not know or do not trust can have disastrous consequences. Imagine sending your credit card number to a Web site that claims to be an electronic shopping mall, but in reality is a machine operated by a malicious hacker. The person who gathered all of those credit card numbers could then use them to buy all kinds of products at your expense. So, you need to make absolutely sure that the server you are communicating with is the real thing. (Server validation does not, of course, eliminate the usual precautions of verifying that the business is actually reputable.) This identity verification also applies to the client sending a request to a server. When the server is used to store and retrieve confidential information, it needs to be sure of the identity of any client that requests information.

The second important point is the protection of the data exchanged between client and server. Once they have established each other's identity, the data needs to be encrypted to make sure that only the client and server can read it. The reason for this is simple. If that same malicious hacker succeeded in eavesdropping on the actual communications, he or she could record the data exchanged and examine it afterwards. In this case the identity of this snooping machine does not matter as it only reads the data that flows between the client and server. Snooping is one thing, but if that hacker succeeded in becoming "the man in the middle," he or she could change the data between receiving it and sending it on to its destination. So, a secure channel must also protect the integrity of the data; SSL does this by attaching a keyed checksum (a message authentication code) to the encrypted data. For more detailed information about cryptography in general, I recommend Bruce Schneier's book, Applied Cryptography. Be aware, however, that if you are dealing across national boundaries, import/export considerations may come into play. Export of sophisticated cryptographic software from the United States, although under review, is currently prohibited. Thus, a U.S. company dealing with foreign customers may be forced to use a different cryptographic strategy than one dealing solely with U.S.-based customers.
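To make the man-in-the-middle point concrete, the following minimal sketch shows how a keyed checksum lets the receiver detect a modified message. It uses HMAC-MD5 from Python's standard library, which is an assumption made for brevity: SSL 3.0 defines its own MAC construction, and the shared secret and message here are hypothetical placeholders.

```python
import hashlib
import hmac

# Hypothetical placeholder; in SSL the key material comes out of the handshake.
shared_secret = b"agreed-during-the-handshake"

def protect(message: bytes) -> tuple:
    """Return the message together with its keyed checksum (MAC)."""
    tag = hmac.new(shared_secret, message, hashlib.md5).digest()
    return message, tag

def is_untouched(message: bytes, tag: bytes) -> bool:
    """Recompute the MAC and compare it with the one that was received."""
    expected = hmac.new(shared_secret, message, hashlib.md5).digest()
    return hmac.compare_digest(expected, tag)

message, tag = protect(b"ship 1 book to Groningen")
tampered = message.replace(b"1 book", b"9 books")

print(is_untouched(message, tag))    # original message passes the check
print(is_untouched(tampered, tag))   # modified message fails the check
```

Someone in the middle who does not know the shared secret cannot produce a matching checksum for the altered message.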

A Web Server

Before we can secure the traffic with a Web server, we first need to find out how the browser and server communicate. The basic way of communicating is to have the browser send a request to the server. The server processes this request and returns the resulting data to the browser. The browser will then interpret these data and display them on screen. The connection between the browser and server is based on TCP, so a virtual circuit is established between the browser and server for each request. The reason that TCP is used instead of UDP has to do with reliability. UDP datagrams can be lost without detection by the sender. This could mean that a request that was issued never reached the server. By using TCP, the browser can be sure that a request it has sent will either reach the server or produce an error that the browser can detect.
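The request/response cycle described above can be sketched with plain TCP sockets. This is only an illustration running on the local machine: the tiny one-shot server, the path /index.html, and the reply text are all made up for the example, and no encryption is involved yet.

```python
import socket
import threading

def serve_once(listener):
    """Accept one connection, read the request, send a fixed reply, close."""
    conn, _addr = listener.accept()
    with conn:
        request = conn.recv(4096)                  # enough for this tiny request
        first_line = request.split(b"\r\n", 1)[0]
        body = b"You asked for: " + first_line
        conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\n" + body)

def fetch(host, port, path):
    """Open a TCP connection, send one HTTP/1.0 request, return the raw reply."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:                           # server closed: reply is complete
                break
            chunks.append(data)
    return b"".join(chunks)

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))                    # port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

server = threading.Thread(target=serve_once, args=(listener,))
server.start()
reply = fetch("127.0.0.1", port, "/index.html")
server.join()
listener.close()

print(reply.decode("ascii"))
```

Everything in this exchange travels as readable text, which is exactly what an eavesdropper on the network would see.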

A connection between browser and server exists during the processing of a request. So for one page, this might mean that a browser will send more than one request to the server and, thus, have more than one connection active at the same time (e.g., when a page contains an image, this image is requested with a separate request and the data are processed separately). The question now arises which of these requests should be protected and which can be left unprotected.

The simplest approach would be to say that all forms that contain sensitive data must be protected from snooping. So, when a user has entered the necessary data in this form (e.g., his or her name, phone number, and credit card number) that data must be submitted to the server through a secure channel. Remember that entering the data is a local thing; it is the browser that reads the data from the user, enters it into the form, and displays it. Only when the user presses the SUBMIT button (or whatever it is called on the form) are the data packed into URL-encoded format and sent to the server. Normally, the server will hand these data to a special CGI script. This script will then process the data and return information to be displayed to the user. This receiving, processing, and displaying is also done in one connection. It is mainly these types of requests that need to be protected.
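As a small illustration of what "packed into URL format" means, the sketch below encodes a form the way a browser would before submitting it. The field names and values are invented for the example.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form contents; the field names are made up for illustration.
form_fields = {
    "name": "A. Customer",
    "phone": "+31 50 555 0100",
    "card": "0000 0000 0000 0000",
}

encoded = urlencode(form_fields)      # what the browser puts on the wire
print(encoded)
# name=A.+Customer&phone=%2B31+50+555+0100&card=0000+0000+0000+0000

decoded = parse_qs(encoded)           # what a CGI script recovers on the server
print(decoded["card"][0])
# 0000 0000 0000 0000
```

Note that URL encoding is only a transport format. Anyone who can read the connection can reverse it, which is why the submission itself has to travel over a protected channel.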

Secure Sockets Layer

A special standard has been proposed (and accepted) to protect the sensitive Web requests described above. This standard is called SSL, which stands for Secure Sockets Layer. This layer sits on top of the "normal" Berkeley sockets and provides a way both to verify the identity of the client and server and to encrypt the data exchanged between them.

The Secure Sockets Layer standard was devised by Netscape and has been adopted by a large number of vendors. On Netscape's Web site (http://www.netscape.com) you will find the specifications of SSL (both version 2.0 and 3.0) and a reference implementation called SSLref. You can use this reference implementation as an example if you want to build a secure communication package. As far as I know, this SSLref implementation is only available for U.S. residents.

SSL operates in two steps. The first step, the handshake, is used when building a connection: client and server exchange information about each other and agree on how to protect the connection. The second step is the exchange of the actual data, encrypted as agreed during the handshake. The handshake is shown in Figure 1. (This figure is also available on Netscape's Web site.)

The following is a "free hand" description of the SSL 3.0 protocol definition. The parameters of the session are determined by the SSL Handshake Protocol. When an SSL client and server communicate, they must establish a protocol version, select cryptographic algorithms, authenticate each other (optional), and use public-key encryption techniques to generate shared secrets. These processes are performed in the handshake protocol.

The client sends a ClientHello message to the server. The server must respond with a ServerHello message, or else a fatal error will occur, and the connection will fail. The ClientHello and ServerHello messages are used to establish the security relation between client and server. They establish the following attributes: protocol version, session ID, cipher suite, and compression method. Additionally, two random values are generated and exchanged: ClientHello.random and ServerHello.random. These values are later mixed into the generation of the shared secrets, so that every session ends up with fresh keys.
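The hello exchange can be pictured as two small records and a selection rule. The sketch below is a simplified model, not the SSL wire format: the class and function names are my own, the selection policy (first client preference the server also supports) is an assumption, and the random value is shown as 32 opaque bytes although the real structure starts with a timestamp.

```python
import os
from dataclasses import dataclass, field

@dataclass
class ClientHello:
    version: tuple              # highest protocol version the client supports
    session_id: bytes           # empty when asking for a new session
    cipher_suites: list         # in order of client preference
    compression_methods: list
    random: bytes = field(default_factory=lambda: os.urandom(32))

@dataclass
class ServerHello:
    version: tuple
    session_id: bytes
    cipher_suite: str
    compression_method: str
    random: bytes = field(default_factory=lambda: os.urandom(32))

def answer_hello(hello, server_version, server_suites, server_compression):
    """Build a ServerHello, or raise if the two sides have nothing in common."""
    suite = next((s for s in hello.cipher_suites if s in server_suites), None)
    method = next((m for m in hello.compression_methods if m in server_compression), None)
    if suite is None or method is None:
        raise ValueError("handshake failure: no common cipher suite or compression method")
    return ServerHello(
        version=min(hello.version, server_version),       # lower of the two versions
        session_id=hello.session_id or os.urandom(32),    # simplification: new ID if none offered
        cipher_suite=suite,
        compression_method=method,
    )

client = ClientHello(
    version=(3, 0),
    session_id=b"",
    cipher_suites=["SSL_RSA_WITH_RC4_128_MD5", "SSL_RSA_EXPORT_WITH_RC4_40_MD5"],
    compression_methods=["null"],
)
# A server that only offers the export-strength suite.
server = answer_hello(client, (3, 0), {"SSL_RSA_EXPORT_WITH_RC4_40_MD5"}, {"null"})
print(server.cipher_suite)
# SSL_RSA_EXPORT_WITH_RC4_40_MD5
```

In this run a client that prefers the strong suite still ends up with the weaker one, because that is the only suite both sides share.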

Following the hello messages, the server will send its security certificate, if it needs to be authenticated. Additionally, a ServerKeyExchange message may be sent if necessary (e.g., if the server has no certificate, or if the certificate is for signing only and not encrypting). If the server is authenticated, it may request a certificate from the client, if appropriate. In most Web-related communications, the client is NOT authenticated. For on-line links in a real client/server environment, the client does need to be authenticated. Next, the server will send the ServerHelloDone message, indicating that the hello message phase of the handshake is complete. The server will then wait for a client response.

If the server has sent a certificate request message, the client must send either the certificate message or a no certificate alert. Then, the client key exchange message is sent, and the content of that message will depend on the public key algorithm selected between the client hello and the server hello. If the client has sent a certificate with signing ability, a digitally signed certificate verify message is sent to explicitly verify the certificate.

At this point, a ChangeCipherSpec message is sent by the client, and the client copies the PendingCipherSpec into the CurrentCipherSpec. The client then immediately sends the Finished message under the new algorithms, keys, and secrets. In response, the server will send its own ChangeCipherSpec message, transfer the pending to the current cipher spec, and send its Finished message under the new cipher spec. At this point, the handshake is complete, and the client and server can begin exchanging application layer data. This data will be encrypted using the cipher spec (encryption method) that was agreed upon in the hello messages and put into effect by the ChangeCipherSpec exchange.
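The order of the handshake messages described above can be summarized in a few lines of code. This is a simplified sketch: the function name and its two flags are my own, and it ignores some cases from the text (a signing-only server certificate followed by a ServerKeyExchange, or a client answering a certificate request with a no certificate alert).

```python
def handshake_flow(server_has_certificate=True, client_auth=False):
    """Return the ordered (sender, message) pairs for one full handshake."""
    if client_auth and not server_has_certificate:
        raise ValueError("only an authenticated server may request a client certificate")

    flow = [("client", "ClientHello"), ("server", "ServerHello")]
    if server_has_certificate:
        flow.append(("server", "Certificate"))
    else:
        flow.append(("server", "ServerKeyExchange"))
    if client_auth:
        flow.append(("server", "CertificateRequest"))
    flow.append(("server", "ServerHelloDone"))

    if client_auth:
        flow.append(("client", "Certificate"))
    flow.append(("client", "ClientKeyExchange"))
    if client_auth:
        flow.append(("client", "CertificateVerify"))

    # Both sides switch to the negotiated cipher spec and confirm with Finished.
    flow += [("client", "ChangeCipherSpec"), ("client", "Finished"),
             ("server", "ChangeCipherSpec"), ("server", "Finished")]
    return flow

for sender, message in handshake_flow():
    print(f"{sender:>6} -> {message}")
```

Calling handshake_flow(client_auth=True) adds the three messages that belong to client authentication, which is the variant the text recommends for a real client/server environment.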

As you can see from this description, a lot of information is exchanged between the client and server before even one byte of data is transmitted. The reason for this is that both client and server must agree upon the encryption method used and upon each other's identity.

One concept to note in the process described above is that of certificates. A certificate is nothing more than an electronic way to verify the identity of the sender. To increase the reliability of such a certificate, it must be issued by an organization that is trusted by both communicating parties. This organization is normally called a CA (Certification Authority). An example of such a certificate is shown in Figure 2.

This certificate contains all kinds of administrative information, including the period for which it is valid, the organization that issued it, and for whom it was issued. But the most important fields in the certificate are the public key field and the signature field. The public key field holds the public key of the server (see sidebar "Public Key Cryptography"), which the client can use to encrypt information that only the server can decrypt and to check signatures made by the server. The signature field is used to verify the certificate itself. It is a hash of the certificate contents (using the MD5 hash algorithm) encrypted with the private key of the issuer, so it can be checked with the issuer's public key. For a self-signed certificate the issuer is the server itself, and the server's own public key is used. The public key of a server is used during the initial phase to verify the identity of the server (or client if need be).
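The idea behind the signature field (hash the certificate contents, then transform the hash with a private key so that anyone holding the public key can check it) can be shown with textbook RSA. The sketch below is deliberately a toy: the key is built from two small, well-known Mersenne primes, there is no padding, and the certificate body is an invented string rather than a real X.509 structure. None of this is suitable for actual use.

```python
import hashlib

# Toy issuer key pair. These primes are far too small and too special
# for real keys; they just keep the example self-contained.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)

def md5_as_int(data: bytes) -> int:
    """MD5 digest of the data, read as one big integer (always below N here)."""
    return int.from_bytes(hashlib.md5(data).digest(), "big")

def issuer_sign(cert_body: bytes) -> int:
    """What the issuer does once: transform the hash with the private key."""
    return pow(md5_as_int(cert_body), D, N)

def check_signature(cert_body: bytes, signature: int) -> bool:
    """What every client does: undo the transform with the public key and compare."""
    return pow(signature, E, N) == md5_as_int(cert_body)

# Invented stand-in for the certificate contents.
cert_body = b"subject=scarab.reseau.nl; issuer=Test CA; valid=60 days"
signature = issuer_sign(cert_body)

print(check_signature(cert_body, signature))                                  # True
print(check_signature(cert_body.replace(b"scarab", b"scarab2"), signature))   # False
```

Changing even one character of the certificate body changes the hash, so a signature made for the original contents no longer checks out.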

To be able to use SSL as a secure communications mechanism, you need both a browser and server that support this protocol. Fortunately, most available browsers have support for SSL already built in. Note that due to the ITAR regulations, the U.S.-based browsers are available in both an export version (using weak keys) and a domestic version (using strong keys).

The other side of the connection, the Web server program or http daemon, is a different story. Some companies use Netscape's Commerce Server, which has SSL support built in. However, a number of http servers are also freely available through the Internet. The best known of these is Apache, an http server that is based on the NCSA daemon. The Apache daemon is well-known for its speed, reliability, and extensibility. It is very simple to add modules to the Apache daemon (e.g., to support anonymous ftp via your Web pages with built-in SSL support).

Getting and Building SSL

Getting and building an SSL library has some legal implications. A version of SSL written by Eric A. Young, called SSLeay, is available from:

http://www.psy.uq.oz.au/~ftp/Crypto

It can be used in both commercial and noncommercial applications, and it is completely compatible with the SSL specs. However, as I explained before, SSL uses the RSA algorithm to exchange the encryption keys and to verify the identity of the server and client. The RSA algorithm is patented in the United States. So, if you want to use the publicly available version of SSL, you should probably use it with the RSAREF implementation (available through RSA, http://www.rsa.com) instead of the one included in the SSL distribution. This automatically limits you to applications in the noncommercial environment. Before you download anything and start deploying it, however, please remember the earlier import/export caveat and check your local legislation.

After you have downloaded the source file, you can uncompress and untar it in an appropriate working directory. You can then follow the instructions in the INSTALL file to build the libraries and binaries on your platform, which is all rather straightforward. After installing SSL you will have a library and a number of binaries you can use to generate certificates, as well as to encrypt and decrypt data.

Once you have SSL running, you can generate your own certificates with the req program. The options for this command are shown in Figure 3. You can use these (dummy) certificates for testing purposes. With the following command (assuming SSL is installed in the /usr/local/ssl directory), you can generate such a certificate:

cd /usr/local/ssl/certs
req -new -x509 -days 60 -out test.pem -keyout test.pem

Running this command will generate the output and series of questions shown in Figure 4. This will create a new certificate containing the information you entered. This is a self-signed certificate, so the issuer and subject fields should be the same. The passphrase you entered at the beginning is used to encrypt the private key in the generated file, so somebody snooping around on your system cannot read it. If you had used the -nodes option while generating, the question for a passphrase would have been skipped, and the private key would have been stored unencrypted, readable by anyone with access to the file.

You can verify the generated certificate with the verify command:

verify /usr/local/ssl/certs/test.pem

Note that this only works well when you are using an unencrypted certificate. It is important to become familiar with the different commands in the SSL suite, as you need them to manage the certificates for your secure Web server.

Building Apache

Once you have the SSL code up and running, you need to get the plain Apache sources. These are available through the Apache organization and can be downloaded from:

http://www.apache.org

As of this writing, two versions of the Apache daemon are available, 1.0.5 and a 1.1 beta. If you want to use Apache in a production environment, I recommend using version 1.0.5, which has proven to be stable and reliable. The 1.1 beta version is experimental and might give you some problems under heavy load. Also, if you are already using Apache version 1.0.3, I suggest you upgrade to version 1.0.5. The older version contains a nasty security bug that may compromise your server.

At the Apache home page you will also find links to patch files that add SSL support to the version of Apache you are using. You should download the patch file for your version as well. If you cannot find it there, try the SSL home page, which also has references to the Apache patches.

After downloading the tar file for Apache, you can uncompress and untar it. You should then edit the Configuration file to reflect your system and run the Configure program to build a Makefile. Once you have this Makefile, you can call make, and the Apache daemon will be built. In this way, you will get a "plain" Apache daemon without SSL support. I recommend doing this to see if you can build a properly working Apache daemon on your system before introducing the additional complication of SSL.

Once you have a properly running Apache daemon, you can apply the SSL patches. Probably the best thing to do is to make a copy of the plain directory and name it something like apache-1.0.5.ssl. This is a way to ensure you also have the original sources available. Next, you can apply the patches according to the instructions in the patch file, run Configure again, and build the new Apache daemon. Make sure that you have uncommented the last line in the Configuration file, so SSL support is included:

# Apache inverts the module list. SSL must go first to
# fake basic authorization. So, uncomment this line to
# add SSL:

Module ssl_module apache_ssl.o

After building, you will get a program called httpsd, for secure http daemon. You can then install it, preferably next to the normal http daemon.

Configuring httpsd

Once you have built httpsd, you need to configure it before you can use it. I will only show you the SSL-related configuration items; the rest is normal Apache configuration. The first file you need to adapt is the httpd.conf file. I copied the existing file and named the copy httpsd.conf. When you start the httpsd server you can use the -f option to specify the configuration file, in this case /etc/httpsd.conf. First, you should change the port number the server will listen on. A normal http server will listen on port 80, whereas a secure http server normally listens on port 443. Remember to add this port to your /etc/services file, so you can use it by name rather than by number. Below is an excerpt from the services file:

https           443/tcp

The additional configuration items are related to the location of your server certificate and the certificate of the CA. All of these items are explained in the documentation that comes with the patch file. Figure 5 shows an example of the configuration.

Note that you really should generate a certificate with encryption to prevent anyone from reading the server's private key from the file. This, however, brings one complication: when you start the secure Web server, you need to enter the passphrase so the daemon can decrypt the file. This passphrase is read from a tty (SSL insists on that), so starting the secure httpsd completely automatically is not possible.

Using a Secure Web Site

Once you have succeeded in creating a certificate and starting your httpsd, you can start accessing it with your browser. The URL for these secure pages would be something like:

https://scarab.reseau.nl/reseau.html

Note the https instead of the usual http. Accessing these pages from Netscape will issue a warning. The self-signed certificate of your server will not immediately be accepted by Netscape. It does not recognize the CA (you) that signed the certificate, and Netscape 1.x may refuse to access the pages. If you are using Netscape 2.x, you are presented with a warning screen, shown in Figure 6, that guides you through accepting the certificate of the unknown server.

You can specify whether you want to reject the certificate, accept it for this session, or accept it always. If you accept it, you will see the page being loaded, and the key icon at the bottom of the Netscape window will change from broken to a normal key. When accepting the certificate, you can request more information about the certificate being used. An example of this is shown in Figure 7.
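The https prefix in the URL is what tells the browser to use SSL and, unless the URL says otherwise, to connect to port 443 instead of port 80. The following sketch shows that decision; the function name and the returned tuple are my own choices for illustration.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def connection_plan(url):
    """Return (host, port, use_ssl) for an http or https URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]   # explicit port wins
    return parts.hostname, port, parts.scheme == "https"

print(connection_plan("https://scarab.reseau.nl/reseau.html"))
# ('scarab.reseau.nl', 443, True)
print(connection_plan("http://scarab.reseau.nl:8080/reseau.html"))
# ('scarab.reseau.nl', 8080, False)
```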

Conclusions

By using the SSL protocol you can secure your communications. You can secure not only your Web traffic, but your other traffic as well (e.g., telnet and ftp). The only drawback of all this is the set of regulations that restrict the export of strong encryption software from the U.S. In my opinion, this will lead to a situation in which every country makes its own standard, which may hinder international communications.

About the Author

Arthur Donkers graduated from the Delft University of Technology with a degree in Electrical Engineering, with a major in Computer Architecture. Since then he has worked for several software houses in the Netherlands and participated in a number of major projects. His primary field of interest in these projects has been, and still is, data communications, especially the integration of multi-vendor network systems. For the last four years he has worked as an independent consultant for his own company, Le Reseau (French for "The Network"). Due to the demand in the market, Le Reseau now focuses on network security-related projects and consultancy.