Determining a Web Server Connection Rate to the Internet

Gilbert Held

Within a relatively short period of time the World Wide Web has risen from obscurity to become the constant focus of attention of Time and Newsweek and numerous technical and trade-related publications. Although a large number of Fortune 500 organizations have connected servers to the Internet, the small and medium-sized organizations are really fueling its growth.

Obtaining a Web address places the old "Mom and Pop" Corporation on equal footing with GM, Ford, and Exxon with respect to providing information to Web surfers. However, most corporations do not have the resources of a Fortune 500 corporation, and most, if not all, Fortune 500 companies are themselves very interested in minimizing their expenditure of resources. As a result, a common problem facing both large and small organizations, including businesses, universities, and government agencies, involves selecting an appropriate Internet connection.

The Connection Problem

A key problem associated with connecting a Web server to the Internet involves determining an appropriate connection method and selecting an operating rate for that connection method. Today, you can select from a dozen or more connection options, ranging from low-speed analog leased lines, whose operating rates are governed by the type of modems used, to a variety of digital services. Concerning the latter, many Internet access providers support the use of leased 56 Kbps, fractional T1, full T1, fractional T3, and full T3 connectivity options. In a year of political rhetoric, this is surely a choice, not an echo!

The problem facing network managers and administrators concerning the connection of a Web server to the Internet is twofold. If you select a connection method using a line operating rate that is too slow, you run the risk of alienating potential customers as they grow frustrated attempting to surf through pages on your Web server. If you select a connection method that results in a line operating rate that exceeds the access requirements of users, you will more than likely waste corporate funds by providing an excessive transmission capability. Thus, the Web server connection process needs a mechanism to provide network managers and administrators with information concerning an appropriate Internet connection rate. This mechanism can be obtained through a traffic estimation process, which is the focus of this article. It provides a starting point for understanding both Web server traffic and the constraints associated with enabling Internet users to access your Web server.

Although the traffic estimation process can provide insight concerning an applicable connection method and operating rate for the attachment of the corporate, government, or university Web server, the process itself does not examine the practicality of the connection method. That is, since a Web server is normally placed on a local area network (LAN) and the LAN is actually connected via a router to the Internet, you must also examine the data flow on the LAN. Doing so will enable you to determine whether the Web server can obtain the necessary amount of LAN bandwidth to enable the WAN link to be fully utilized. If not, you might consider LAN segmentation, moving other servers or workstations to another LAN segment, or restructuring your LAN using another method. One or more of those methods would enable the Web server to obtain sufficient LAN bandwidth to fully utilize the WAN operating rate.

The Traffic Estimation Process

To begin the traffic estimation process, you must first review the design of your organization's Web pages, including their links to one another. You will need a series of Web pages to illustrate this process, so assume the design of the M&P Corporation Web server resulted in the creation of seven Web pages whose hyperlink relationships are illustrated in Figure 1.

As shown in Figure 1, the Internet user first obtains a display of the organization's home page. From the home page, the user can select one of three hyperlinks, resulting in the display of a second Web page (the first page in a two-tiered display). Note that each page provides a link back to the organization's home page, and the bottom page in each two-tiered display also provides a link back to the top page. Thus, the hyperlink relationship enables a user to easily jump from one display to another within a series of three topics or subjects. However, the sample design does not facilitate intertopic connectivity. Instead, the person must first return to the home page and select a hyperlink associated with a new topic or subject. This design is not representative of a surfer-friendly Web page, but it suffices to illustrate the methodology behind the estimation process.

Illustrating the traffic estimation process requires knowledge of the number of characters or bytes associated with each page and the average number of pages each user accessing the M&P Corporation server will view. For example, assume the home page, which is the only page always accessed by a person viewing this site, consists of two GIF images, each requiring 95,000 bytes of storage, plus 800 bytes of text. Then, a person viewing this home page will receive 190,800 bytes of data.

For simplicity, assume that each of the pages on tier 1 contains one GIF image requiring 75,000 bytes of storage plus 1,000 bytes of text, for a total of 76,000 bytes, and that each page in the second tier consists of 1,500 bytes of text. The two key estimates for determining a line operating rate are the number of hits you expect during a busy hour and the quantity of traffic that must be transmitted in response to each hit.

To avoid ambiguity, I define a "hit" as a Uniform Resource Locator (URL) reference to a page on your Web server. I define the "busy hour" as the 1-hour period that will have the highest number of attempted or actual hits. Assume that, based upon a survey of existing Web sites providing similar service to yours, you have determined that those sites average 5,000 hits per day, with a daily busy hour averaging 750 hits. Although every person accessing your Web server will initially access your home page, you estimate that subsequent page references will result in a distribution such that access to the home page will represent 60 percent of all hits. Tier 1 and tier 2 page accesses will represent 25 and 15 percent of all hits, respectively. Thus, the average data transfer per hit for the M&P Corporation Web server becomes:

(190,800 * 0.60) + (76,000 * 0.25) + (1,500 * 0.15) = 133,705 bytes
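The same weighted average can be sketched in a few lines of Python. The page sizes and hit percentages below are the article's figures; the names (PAGE_BYTES, HIT_SHARE_PERCENT, average_bytes_per_hit) are illustrative only and do not come from the original worksheet.

```python
# Page sizes in bytes, taken from the M&P Corporation example.
PAGE_BYTES = {
    "home": 2 * 95_000 + 800,   # two GIF images plus text
    "tier1": 75_000 + 1_000,    # one GIF image plus text
    "tier2": 1_500,             # text only
}

# Estimated share of all hits for each kind of page, in percent.
HIT_SHARE_PERCENT = {"home": 60, "tier1": 25, "tier2": 15}


def average_bytes_per_hit(page_bytes, hit_share_percent):
    """Weight each page size by that page type's share of all hits."""
    if sum(hit_share_percent.values()) != 100:
        raise ValueError("hit shares must total 100 percent")
    weighted = sum(page_bytes[name] * share
                   for name, share in hit_share_percent.items())
    return weighted / 100


print(average_bytes_per_hit(PAGE_BYTES, HIT_SHARE_PERCENT))  # 133705.0
```

Changing the percentages shows how sensitive the estimate is to the hit distribution, since the home page is far larger than the other pages.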

Now that you have obtained the average number of bytes expected to be transferred per server hit, you can use that value in conjunction with the estimated number of busy hour hits to compute the line operating rate necessary to accommodate expected busy hour traffic. That is, since you expect 750 hits per busy hour, and the average transmission response per hit represents the transfer of 133,705 bytes, a reasonable estimate of the line operating rate needed becomes:

(750 hits/hour * 133,705 bytes/hit * 8 bits/byte) / (60 seconds/minute * 60 minutes/hour) = 222,841 bps

For estimating purposes, you can ignore the overhead of the transport protocol. Although a more precise calculation could be made, it is easier to simply use caution if the estimated operating rate requirement is within 10 to 15 percent of the line rating.
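As a rough sketch, the line rate calculation and the caution band can be expressed in Python. The inputs are the article's figures, and the 256,000 bps value is the 256 Kbps fractional T1 rate considered for this example. The function names, including the headroom_fraction helper, are hypothetical and added only for illustration.

```python
def required_line_rate_bps(busy_hour_hits, avg_bytes_per_hit):
    """Average bits per second needed to carry the busy-hour traffic."""
    bits_per_busy_hour = busy_hour_hits * avg_bytes_per_hit * 8
    return bits_per_busy_hour / 3600  # 60 seconds/minute * 60 minutes/hour


def headroom_fraction(required_bps, line_rate_bps):
    """Fraction of the line rating left unused by the estimate (illustrative helper)."""
    return (line_rate_bps - required_bps) / line_rate_bps


required = required_line_rate_bps(750, 133_705)
print(int(required))                                    # 222841
print(round(headroom_fraction(required, 256_000), 3))   # 0.13
```

Under these assumptions the estimate sits about 13 percent below a 256 Kbps line rating, which is inside the 10 to 15 percent band where caution is suggested.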

Based upon the preceding, you might be tempted to call your Internet access provider and request the installation of a 256 Kbps fractional T1 line. However, you should examine another important constraint that will govern the ability of your organization to use the bandwidth of your Internet access connection. That constraint is the method by which you anticipate connecting your Web server to the Internet.

Examining LAN Utilization

Figure 2 illustrates the method by which most Web servers are connected to the Internet. In Figure 2, a leased line at a 256 Kbps operating rate, which was previously determined suitable for supporting Internet traffic to the Web server, is shown installed between an Internet access provider and the M&P Corporation LAN. At the customer site, the line is terminated by a router, which in turn is connected to a LAN on which the Web server resides. The figure shows a bus-structured Ethernet LAN. But in actuality, any type of LAN could be used as long as the router supports the LAN architecture and you are able to install a TCP/IP stack on the server that also supports the selected LAN.

We have made several assumptions concerning the composition of the M&P Corporation's home page and other pages residing on the corporate server, which were used to compute a WAN operating rate, but we still do not know if that operating rate is actually achievable. We have not yet considered the constraints imposed on the Web server's ability to transmit data due to the traffic generated by other workstations and servers that may reside on the same LAN. The networking constraints of the internal network will serve as a transmission cap for the WAN and will govern the maximum practical operating rate of the transmission facility linking the Internet access provider to the corporate Web server.

Determining a Maximum Operating Rate

Assume that in addition to the Web server located on the Ethernet LAN, there are 23 workstations and another server on that network, resulting in a total of 25 devices sharing the transmission bandwidth of the LAN. This means that the Web server, on average, will obtain 1/25th of the 10 Mbps bandwidth of the LAN, or 400,000 bps. Since the efficiency of an Ethernet LAN severely degrades when the network utilization level exceeds 60 percent, this would further reduce the sustained supportable average bandwidth obtained by the Web server to 400,000*60%, or 240,000 bps. Thus, the LAN itself acts as a constraint limiting the line operating rate from the Internet access provider to the customer premises to an average of 240,000 bps. Now that we have computed an upper limit resulting from LAN activity, we can compare that limit to the previously computed WAN operating rate.
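The LAN constraint reduces to a short calculation, sketched here in Python. The 10 Mbps rate, 25 devices, and 60 percent utilization ceiling are the article's figures. The equal-share model is the simplifying assumption made above, and the function name is illustrative.

```python
def sustainable_lan_share_bps(lan_rate_bps, device_count, max_utilization_percent):
    """Average bandwidth one device can sustain on a shared LAN.

    Assumes every device receives an equal share of the LAN and that the
    LAN should not be loaded beyond max_utilization_percent.
    """
    if device_count < 1:
        raise ValueError("device_count must be at least 1")
    equal_share = lan_rate_bps / device_count
    return equal_share * max_utilization_percent / 100


print(sustainable_lan_share_bps(10_000_000, 25, 60))  # 240000.0
```

Moving stations to another segment lowers device_count, which is why LAN segmentation raises the bandwidth available to the Web server.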

Previously, we computed the necessary WAN operating rate to support 750 hits during the busy hour to be approximately 223 Kbps and selected a 256 Kbps fractional T1 connection to provide the necessary WAN connection. In examining the traffic on the LAN, we determined that the Web server would be capable of obtaining an average sustainable bandwidth of 240 Kbps. Since the average sustainable LAN bandwidth exceeds the WAN operating rate required to support 750 hits during the busy hour, we do not have to consider modifying the LAN to increase the bandwidth available to the Web server.

To facilitate the selection of an appropriate Web server WAN connection, I developed a spreadsheet model using Lotus 1-2-3 Release 5 for Windows. The cell contents of the spreadsheet model are listed in Figure 3, and the resulting spreadsheet display is shown in Figure 4.

Note that the worksheet in Figure 4 was based upon a three-tiered Web server page relationship that may or may not apply to your organization. I also assumed that there were a total of 25 stations on the LAN on which the server would reside, and assumed the number of hits during the busy hour to total 750. Thus, there are numerous assumptions you will need to make to select an appropriate WAN operating rate.
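For readers without the spreadsheet, the following Python sketch shows one way to vary these assumptions. It is not a transcription of the worksheet cells in Figure 3. The class and method names are hypothetical, and the defaults are simply the values used in this article.

```python
from dataclasses import dataclass


@dataclass
class ConnectionEstimate:
    """Assumptions behind the estimate; defaults are the article's values."""
    busy_hour_hits: int = 750
    avg_bytes_per_hit: float = 133_705
    lan_rate_bps: int = 10_000_000
    lan_stations: int = 25
    lan_max_utilization_percent: int = 60

    def wan_rate_bps(self):
        # Busy-hour bits spread evenly over 3,600 seconds.
        return self.busy_hour_hits * self.avg_bytes_per_hit * 8 / 3600

    def lan_cap_bps(self):
        # Equal share of the LAN, limited by the utilization ceiling.
        return (self.lan_rate_bps / self.lan_stations
                * self.lan_max_utilization_percent / 100)

    def lan_is_sufficient(self):
        return self.lan_cap_bps() >= self.wan_rate_bps()


print(ConnectionEstimate().lan_is_sufficient())                     # True
print(ConnectionEstimate(busy_hour_hits=1000).lan_is_sufficient())  # False
```

In the second case, raising the busy hour to 1,000 hits pushes the required WAN rate to roughly 297 Kbps, above the 240 Kbps the LAN can sustain for the server, so the LAN becomes the limiting factor.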

Once you commence your Internet connection, several Internet access providers can offer services that simplify the problems associated with making these assumptions. They have introduced a relatively new type of service in which users are connected via a full T1 line but are billed only for the organization's sustained usage, which is assigned to a predefined operating rate tier. For example, UUNET now offers a "burstable" T1 line connection, whose cost is determined by the usage level under which 95 percent of traffic samples, taken every five minutes, occur. If 95 percent of the samples in a given month fall below 128 Kbps, the site using the burstable service would be charged the monthly cost for the 0 to 128 Kbps burstable usage tier.
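The tier selection can be illustrated with a short Python sketch. It is only an approximation of the idea, not any provider's actual billing method. Only the 0 to 128 Kbps tier is named in this article; the other tier ceilings, the function names, and the choice to count a sample equal to a ceiling as inside that tier are assumptions made for illustration.

```python
def percentile_95(samples_bps):
    """Smallest sample value that at least 95 percent of samples do not exceed."""
    if not samples_bps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_bps)
    # Integer form of ceil(0.95 * n) - 1, which avoids floating-point rounding.
    index = (95 * len(ordered) + 99) // 100 - 1
    return ordered[index]


# Only the 128 Kbps ceiling comes from the article; the rest are hypothetical.
TIER_CEILINGS_BPS = [128_000, 256_000, 512_000, 1_544_000]


def billing_tier_ceiling(samples_bps, tier_ceilings_bps=TIER_CEILINGS_BPS):
    """Return the ceiling of the lowest tier that covers the 95th-percentile usage."""
    usage = percentile_95(samples_bps)
    for ceiling in sorted(tier_ceilings_bps):
        if usage <= ceiling:
            return ceiling
    raise ValueError("usage exceeds the highest tier")


# Nineteen quiet five-minute samples and one burst: the burst is ignored.
print(billing_tier_ceiling([100_000] * 19 + [900_000]))  # 128000
```

With two bursts out of twenty samples, more than 5 percent of the samples exceed the lowest ceiling, and the same function returns the highest tier instead.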

This new approach to server connectivity can be extremely advantageous because it measures usage over the 24-hour clock and provides a full 1.544 Mbps bandwidth to satisfy future growth as well as peak workloads. Although it is extremely important to plan an appropriate Internet connection operating rate, many organizations just jumping into the arena may not have the necessary information for predicting a line operating rate with confidence. For those organizations, as well as organizations whose marketing efforts can have a dramatic effect on daily hits, the use of a burstable T1 service may be just what the doctor ordered. For other organizations that are better able to quantify their customers and forecast potential hits, the traffic estimation worksheet may be a valuable tool.

About the Author

Gilbert Held is an internationally known author and lecturer specializing in the application of technology. Gil's recent books include Internetworking LANs and WANs, Ethernet Networks 2ed., Data and Image Compression 4ed., and Protecting LAN Resources, all published by John Wiley & Sons. Gil can be reached at 235-8068@mcimail.com.