Determining a Web Server Connection Rate to the Internet
Gilbert Held
Within a relatively short period of time, the World Wide Web has risen
from obscurity to become the constant focus of attention of Time,
Newsweek, and numerous technical and trade-related publications.
Although a large number of Fortune 500 organizations have connected
servers to the Internet, small and medium-sized organizations are
really fueling its growth.
Obtaining a Web address places the old "Mom and Pop" corporation on
equal footing with GM, Ford, and Exxon with respect to providing
information to Web surfers. However, most organizations do not have
the resources of a Fortune 500 corporation, and most, if not all,
Fortune 500 companies are themselves very interested in minimizing
their expenditure of resources. As a result, a common problem facing
both large and small organizations, including businesses,
universities, and government agencies, involves selecting an
appropriate Internet connection.
The Connection Problem
A key problem associated with connecting a Web server
to the Internet
involves determining an appropriate connection method
and selecting an
operating rate for that connection method. Today, you can select from
a dozen or more connection options, ranging from low-speed analog
leased lines, whose operating rates are governed by the type of modems
used, to a variety of digital services. Concerning the latter, many
Internet access providers support leased 56 Kbps, fractional T1, full
T1, fractional T3, and full T3 connectivity options.
In a year of
political rhetoric, this is surely a choice, not an
echo!
The problem facing network managers and administrators
concerning the
connection of a Web server to the Internet is twofold.
If you select a
connection method using a line operating rate that is
too slow, you run
the risk of alienating potential customers as they grow
frustrated
attempting to surf through pages on your Web server.
If you select a
connection method that results in a line operating rate
that exceeds the
access requirements of users, you will more than likely
waste corporate
funds by providing an excessive transmission capability.
Thus, the Web
server connection process needs a mechanism to provide
network managers
and administrators with information concerning an appropriate
Internet
connection rate. This mechanism can be obtained through
a traffic
estimation process, which is the focus of this article.
It provides a
starting point for understanding both Web server traffic
and the
constraints associated with enabling Internet users
to access your Web
server.
Although the traffic estimation process can provide
insight concerning
an applicable connection method and operating rate for
the attachment of
the corporate, government, or university Web server,
the process itself
does not examine the practicality of the connection
method. That is,
since a Web server is normally placed on a local area
network (LAN) and
the LAN is actually connected via a router to the Internet,
you must
also examine the data flow on the LAN. Doing so will
enable you to
determine whether the Web server can obtain the necessary
amount of LAN
bandwidth to enable the WAN link to be fully utilized.
If not, you might
consider LAN segmentation, moving other servers or workstations
to
another LAN segment, or restructuring your LAN using
another method. One
or more of those methods would enable the Web server
to obtain
sufficient LAN bandwidth to fully utilize the WAN operating
rate.
The Traffic Estimation Process
To begin the traffic estimation process, you must first review the
design of your organization's Web pages, including their links to one
another. You will need a series of Web pages to illustrate this
process, so assume the design of the M&P Corporation Web server
resulted in the creation of seven Web pages whose hyperlink
relationships are illustrated in Figure 1.
As shown in Figure 1, the Internet user first obtains
a display of the
organization's home page. From the home page, the user
can select one of
three hyperlinks, resulting in the display of a second
Web page (the
first page in a two-tiered display). Note that each
page provides a link
back to the organization's home page, and the bottom
page in each
two-tiered display also provides a link back to the
top page. Thus, the
hyperlink relationship enables a user to easily jump
from one display to
another within a series of three topics or subjects.
However, the sample
design does not facilitate intertopic connectivity.
Instead, the person
must first return to the home page and select a hyperlink
associated
with a new topic or subject. This design is not representative
of a
surfer-friendly Web page, but it suffices to illustrate
the methodology
behind the estimation process.
Illustrating the traffic estimation process requires knowledge of the
number of characters or bytes associated with each page and the
average number of pages each user accessing the M&P Corporation server
will view. For example, assume the home page, which is the only page
always accessed by a person viewing this site, consists of two GIF
images, each requiring 95,000 bytes of storage, plus 800 bytes of
text. Then a person viewing this home page will receive
(2 * 95,000) + 800 = 190,800 bytes of data.
For simplicity, assume that each of the pages on tier 1 contains one
GIF image requiring 75,000 bytes of storage and 1,000 bytes of text,
for a total of 76,000 bytes, and that each page in the second tier
consists of 1,500 bytes of text. The two key estimates for determining
a line operating rate are the number of hits you expect during a busy
hour and the quantity of traffic that must be transmitted in response
to each hit.
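The page sizes assumed above can be tallied with a short Python sketch. This is only an illustration of the arithmetic; the function and variable names are my own and do not come from the worksheet discussed later in this article.

```python
def page_bytes(image_sizes, text_bytes):
    """Bytes sent for one page: every image on the page plus its text."""
    return sum(image_sizes) + text_bytes

# Sizes assumed in the M&P Corporation example.
home_page = page_bytes([95_000, 95_000], 800)   # two GIF images plus text
tier1_page = page_bytes([75_000], 1_000)        # one GIF image plus text
tier2_page = page_bytes([], 1_500)              # text only

print(home_page, tier1_page, tier2_page)  # 190800 76000 1500
```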
To avoid ambiguity, I define a "hit" as a
Uniform Resource Locator (URL)
reference to a page on your Web server. I define the
"busy hour" as the
1-hour period that will have the highest number of attempted
or actual
hits. Assume that, based upon a survey of existing Web
sites providing
similar service to yours, you have determined that those
sites average
5,000 hits per day, with a daily busy hour averaging
750 hits. Although
every person accessing your Web server will initially
access your home
page, you estimate that subsequent page references will
result in a
distribution such that access to the home page will
represent 60 percent
of all hits. Tier 1 and tier 2 page accesses will represent
25 and 15
percent of all hits, respectively. Thus, the average
data transfer per
hit for the M&P Corporation Web server becomes:
(190,800 * 0.60) + (76,000 * 0.25) + (1,500 * 0.15) = 133,705 bytes
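The same weighted average can be checked with a few lines of Python. This is a sketch under the assumptions stated above; the dictionary keys and function name are illustrative, and the hit shares are held as whole percentages so the result stays exact.

```python
# Page sizes (bytes) and share of hits (percent) from the example.
page_sizes = {"home": 190_800, "tier1": 76_000, "tier2": 1_500}
hit_percent = {"home": 60, "tier1": 25, "tier2": 15}

def average_bytes_per_hit(sizes, percents):
    """Weight each page size by its share of all hits."""
    if sum(percents.values()) != 100:
        raise ValueError("hit percentages must total 100")
    return sum(sizes[name] * percents[name] for name in sizes) / 100

avg_bytes = average_bytes_per_hit(page_sizes, hit_percent)
print(avg_bytes)  # 133705.0
```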
Now that you have obtained the average number of bytes
expected to be
transferred per server hit, you can use that value in
conjunction with
the estimated number of busy hour hits to compute the
line operating
rate necessary to accommodate expected busy hour traffic.
That is, since
you expect 750 hits per busy hour, and the average transmission
response
per hit represents the transfer of 133,705 bytes, a
reasonable estimate
of the line operating rate needed becomes:
750 hits/hour * 133,705 bytes/hit * 8 bits/byte
----------------------------------------------- = 222,841 bps
      60 seconds/minute * 60 minutes/hour
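Expressed as a small Python function, the calculation looks like the sketch below. The function name is hypothetical, and, like the estimate above, it ignores protocol overhead and assumes the hits are spread evenly across the busy hour.

```python
SECONDS_PER_HOUR = 60 * 60

def required_bps(busy_hour_hits, avg_bytes_per_hit):
    """Average bits per second needed to carry the busy-hour traffic."""
    bits_per_hour = busy_hour_hits * avg_bytes_per_hit * 8
    return bits_per_hour / SECONDS_PER_HOUR

rate = required_bps(750, 133_705)
print(int(rate))  # 222841 (fraction dropped, as in the text)
```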
For estimating purposes, you can ignore the overhead of the transport
protocol. Although a more precise calculation could be made, it is
easier to simply use caution if the estimated operating rate
requirement comes within 10 to 15 percent of the line's operating
rate.
Based upon the preceding, you might be tempted to call
your Internet
access provider and request the installation of a 256
Kbps fractional T1
line. However, you should examine another important
constraint that will
govern the ability of your organization to use the bandwidth
of your
Internet access connection. That constraint is the method
by which you
anticipate connecting your Web server to the Internet.
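Before moving on, the earlier caution about estimates that come within 10 to 15 percent of the line rate can be applied to the 256 Kbps choice. The sketch below is one way to read that rule of thumb; the 15 percent default margin and the function names are my assumptions, not a formal rule.

```python
def headroom_fraction(required_bps, line_bps):
    """Fraction of the line rate left unused by the estimated requirement."""
    return (line_bps - required_bps) / line_bps

def needs_caution(required_bps, line_bps, margin=0.15):
    """True when the estimate falls within `margin` of the line rate."""
    return headroom_fraction(required_bps, line_bps) <= margin

estimate_bps = 222_841   # busy-hour estimate computed earlier
line_bps = 256_000       # 256 Kbps fractional T1

print(round(headroom_fraction(estimate_bps, line_bps) * 100, 1))  # 13.0
print(needs_caution(estimate_bps, line_bps))                      # True
```

With roughly 13 percent of headroom, the 256 Kbps line sits inside the caution band, so protocol overhead and growth in hits deserve a closer look before you commit to that rate.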
Examining LAN Utilization
Figure 2 illustrates the method by which most Web servers
are connected
to the Internet. In Figure 2, a leased line at a 256
Kbps operating
rate, which was previously determined suitable for supporting
Internet
traffic to the Web server, is shown installed between
an Internet access
provider and the M&P Corporation LAN. At the customer
site, the line is
terminated by a router, which in turn is connected to
a LAN on which the
Web server resides. The figure shows a bus-structured
Ethernet LAN. But
in actuality, any type of LAN could be used as long
as the router
supports the LAN architecture and you are able to install
a TCP/IP stack
on the server that also supports the selected LAN.
We have made several assumptions concerning the composition
of the M&P
Corporation's home page and other pages residing on
the corporate
server, which were used to compute a WAN operating rate,
but we still do
not know if that operating rate is actually achievable.
We have not yet
considered the constraints imposed on the Web server's
ability to
transmit data due to the traffic generated by other
workstations and
servers that may reside on the same LAN. The networking
constraints of
the internal network will serve as a transmission cap
for the WAN and
will govern the maximum practical operating rate of
the transmission
facility linking the Internet access provider to the
corporate Web
server.
Determining a Maximum Operating Rate
Assume that in addition to the Web server located on
the Ethernet LAN,
there are 23 workstations and another server on that
network, resulting
in a total of 25 devices sharing the transmission bandwidth
of the LAN.
This means that the Web server, on average, will obtain 1/25th of the
10 Mbps bandwidth of the LAN, or 400,000 bps. Since the efficiency of
an Ethernet LAN degrades severely when the network utilization level
exceeds 60 percent, the sustained average bandwidth available to the
Web server is further reduced to 400,000 * 0.60, or 240,000 bps. Thus,
the LAN itself acts as a constraint, limiting the useful line
operating rate from the Internet access provider to the customer
premises to an average of 240,000 bps. Now that we have computed an
upper limit resulting from LAN activity, we can compare that limit to
the previously computed WAN operating rate.
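The LAN ceiling can be computed the same way. The following Python sketch assumes, as the text does, that every device receives an equal share of the LAN and that 60 percent utilization is the practical limit; the function name and parameters are illustrative.

```python
def sustained_lan_share_bps(lan_bps, devices, max_utilization_pct=60):
    """Average bandwidth one device can sustain on a shared LAN."""
    equal_share = lan_bps // devices            # 1/25th of 10 Mbps here
    return equal_share * max_utilization_pct // 100

lan_cap = sustained_lan_share_bps(10_000_000, 25)
print(lan_cap)  # 240000
```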
Previously, we computed the necessary WAN operating
rate to support 750
hits during the busy hour to be approximately 223 Kbps
and selected a
256 Kbps fractional T1 connection to provide the necessary
WAN
connection. In examining the traffic on the LAN, we
determined that the
Web server would be capable of obtaining an average
sustainable
bandwidth of 240 Kbps. Since the average sustainable LAN bandwidth
exceeds the WAN operating rate required to support 750 hits during the
busy hour, we do not have to consider modifying the LAN to increase
the bandwidth available to the Web server.
To facilitate the selection of an appropriate Web server WAN
connection, I developed a spreadsheet model using Lotus 1-2-3 Release
5 for Windows. The cell contents of the spreadsheet model are listed
in Figure 3, and the resulting spreadsheet display is shown in Figure
4.
Note that the worksheet in Figure 4 was based upon a
three-tiered Web
server page relationship that may or may not apply to
your organization.
I also assumed that there were a total of 25 stations
on the LAN on
which the server would reside, and assumed the number
of hits during the
busy hour to total 750. Thus, there are numerous assumptions
you will
need to make to select an appropriate WAN operating
rate.
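Because the result depends on so many assumptions, it helps to be able to vary them quickly. The Python sketch below is not the Lotus worksheet from Figure 3; it is a hypothetical stand-in that chains the same calculations so you can change one assumption at a time. All names are illustrative.

```python
def estimate(page_bytes, hit_pct, busy_hour_hits, lan_bps, stations,
             max_util_pct=60):
    """Chain the article's estimates: bytes per hit, WAN rate, LAN cap."""
    avg_bytes = sum(b * p for b, p in zip(page_bytes, hit_pct)) / 100
    wan_bps = busy_hour_hits * avg_bytes * 8 / 3600
    lan_cap_bps = lan_bps / stations * max_util_pct / 100
    return {
        "avg_bytes_per_hit": avg_bytes,
        "required_wan_bps": wan_bps,
        "lan_cap_bps": lan_cap_bps,
        "lan_sufficient": lan_cap_bps >= wan_bps,
    }

sizes = [190_800, 76_000, 1_500]   # home, tier 1, tier 2 pages
shares = [60, 25, 15]              # percent of all hits

base = estimate(sizes, shares, 750, 10_000_000, stations=25)
crowded = estimate(sizes, shares, 750, 10_000_000, stations=30)

print(base["lan_sufficient"])     # True
print(crowded["lan_sufficient"])  # False
```

In this sketch, adding just five more stations to the LAN drops the server's sustainable share to about 200 Kbps, below the 223 Kbps requirement, which is the situation in which LAN segmentation becomes worth considering.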
When you set up your Internet connection, several Internet access
providers can offer services that ease the problems associated with
making these assumptions. They have introduced a relatively new type
of service in which users are connected via a full T1 line but are
billed only for the organization's sustained usage, which falls within
a predefined operating rate tier. For example, UUNET now offers a
"burstable" T1 line connection whose cost is determined by the usage
level under which 95 percent of traffic samples, taken every five
minutes, occur. Thus, if 95 percent of the samples in a given month
fall below 128 Kbps, the site using the burstable service would be
charged the monthly cost for a 0 to 128 Kbps burstable usage tier.
This new approach to server connectivity can be extremely
advantageous
because it measures usage over the 24-hour clock and
provides a full
1.544 Mbps bandwidth to satisfy future growth as well
as peak workloads.
Although it is extremely important to plan an appropriate
Internet
connection operating rate, many organizations just jumping
into the
arena may not have the necessary information for predicting
a line
operating rate with confidence. For those organizations,
as well as
organizations whose marketing efforts can have a dramatic
effect on
daily hits, the use of a burstable T1 service may be
just what the
doctor ordered. For other organizations that are better
able to quantify
their customers and forecast potential hits, the traffic
estimation
worksheet may be a valuable tool.
About the Author
Gilbert Held is an internationally known author and
lecturer
specializing in the application of technology. Gil's
recent books
include Internetworking LANs and WANs, Ethernet Networks
2ed., Data and
Image Compression 4ed., and Protecting LAN Resources,
all published by
John Wiley & Sons. Gil can be reached at 235-8068@mcimail.com.