Cover V09, I09


Quality of Service

Gilbert Held

The new millennium is a network-based millennium. When we shop, our purchases are automatically charged to our accounts, and the purchased items are deducted from the inventory control system, facilitating the reordering of merchandise. At home, we can purchase stocks and pay bills, as well as sign up for and take college courses, using our computers. By installing a sound card and connecting a microphone to our computer, it becomes possible to use the computer as a telephone. The addition of a miniature camera can provide a videoconferencing capability, making the PC a ubiquitous device. However, while the number of potential network-based applications is limited only by our imaginations, the characteristics of each application can differ.

If you pick up a trade publication, it is difficult not to encounter the term Quality of Service (QoS). This term references the ability of a network to provide a set of characteristics that tailors the delivery of data to user requirements and is the focus of this article. Actually, this article is the first in a short series that will cover various aspects and components associated with QoS. I will define QoS, and, in doing so, note the metrics by which QoS can be specified, why some metrics are more suitable than others for different types of applications, and examine various networking architecture components that support QoS.

This was written as an introductory article; later ones will go into more detail concerning the tools and techniques available to obtain different levels of QoS as data flows through local and wide area networks. In subsequent articles, I will detail QoS techniques associated with ingress and egress network locations, as well as the flow of data through a WAN and the techniques that can provide different QoS capabilities. Because there is truly no free lunch in networking, I will also discuss one of the key issues that may hamper the ability to obtain QoS on an inter-network basis: charging for the service.


When we pick up a telephone and call a distant party, we obtain a quality of service that makes a voice conversation both possible and practical. The practicality of the call results from the basic design of the telephone company network infrastructure. That infrastructure digitizes voice conversations into a 64-Kbps data stream and routes the digitized conversation through a fixed path, established over the network infrastructure. For the entire path, 64-Kbps of bandwidth is allocated on an end-to-end basis to the call. The fixed path is established through the process referred to as circuit switching.

Under circuit switching, a 64-Kbps time slot is allocated from the entry (ingress) point into the telephone network, through the network, to an exit (egress) point. The 64-Kbps time slot is commonly referred to as a Digital Signal (DS) level 0 (DS0), and the path through which the DS0 signal flows is formed by switches reserving 64-Kbps slices of bandwidth.
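The arithmetic behind the DS0 rate is worth a quick sketch: voice is sampled 8,000 times per second and each sample is encoded as 8 bits, which is where the 64-Kbps figure comes from. (The T1 calculation at the end is supplementary context, not from the article.)

```python
# Why a DS0 time slot is 64 Kbps: standard PCM voice digitization
# samples the voice channel 8,000 times per second, encoding each
# sample in 8 bits.
SAMPLE_RATE_HZ = 8000
BITS_PER_SAMPLE = 8

ds0_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE
print(ds0_bps)          # 64000 -- i.e., 64 Kbps

# Supplementary: a T1 trunk multiplexes 24 such DS0 time slots.
t1_payload_bps = 24 * ds0_bps
print(t1_payload_bps)   # 1536000
```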

As voice is digitized at the ingress point into the telephone company network, a slight delay of a few milliseconds occurs. As each switch performs a cross-connection operation, permitting digitized voice to flow from a DS0 contained in one circuit connected to the switch onto a DS0 channel on another circuit connected to the switch, a path is formed through the network and another delay occurs. Although each cross-connection introduces a slight delay to the flow of digitized voice, the switch delay is minimal, typically a fraction of a millisecond or less. Thus, the total end-to-end delay experienced by digitized voice as it flows through a telephone network is minimal.

Another characteristic of the flow of digitized voice through the telephone network infrastructure concerns the variability or latency differences between each digitized voice sample. Although voice digitization and circuit switching processes add latency to each voice sample, that delay is uniform. Thus, we can characterize the telephone network as a low delay, uniform, or near uniform delay transmission system. Those two qualities — low delay and uniform, or near uniform delay — represent two key Quality of Service metrics. The two metrics are important considerations for obtaining the ability to transmit real-time data, such as voice and video. However, the telephone company infrastructure also provides a third key QoS metric, which is equally important. That metric is a uniform, dedicated, 64-Kbps bandwidth allocated to each voice conversation. Because that bandwidth is dedicated on an end-to-end basis, you can view it as being similar to providing an expressway that allows a stream of cars to travel from one location to another, while prohibiting other cars destined to other locations to share the highway.
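The two delay metrics can be made concrete with a small calculation. This is a minimal sketch with invented timestamps; "jitter" is computed here simply as the mean absolute difference between consecutive one-way delays.

```python
# Hypothetical send/receive timestamps (ms) for four packets sent
# at a uniform 20-ms interval.
send_times = [0.0, 20.0, 40.0, 60.0]
recv_times = [5.0, 25.1, 44.9, 65.2]

# One-way delay of each packet.
delays = [r - s for s, r in zip(send_times, recv_times)]
mean_latency = sum(delays) / len(delays)

# Jitter: mean absolute difference between consecutive delays.
jitter = sum(abs(b - a) for a, b in zip(delays, delays[1:])) / (len(delays) - 1)

print(round(mean_latency, 2), round(jitter, 2))
```

A circuit-switched voice call would show a small, essentially constant delay, i.e., near-zero jitter; a packet network typically would not.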

A fourth QoS characteristic provided by the telephone company infrastructure is the fact that digitized voice flows end-to-end, essentially lossless. That is, there is no planned dropping of voice samples during periods of traffic congestion. Instead, when the volume of calls exceeds the capacity of the network, such as on Mother’s Day or Christmas Eve, new calls are temporarily blocked and the subscriber encounters a “fast” busy signal when dialing. The table below provides a summary of commonly used QoS metrics and their normal method of representation:

Metric                     Normal Representation
Dedicated bandwidth        bps, Kbps, or Mbps
Latency (delay)            msec
Variation (jitter)         msec
Data loss                  percent of frames or packets transmitted

Although the telephone company network infrastructure provides the QoS necessary to support real-time communications, its design is relatively inefficient. This inefficiency results from the fact that, unless humans shout at one another, a conversation is normally half duplex, meaning half of the allocated bandwidth goes unused. Additionally, we periodically pause as we converse. Because 64 Kbps of bandwidth is allocated for the duration of the call, the utilization of bandwidth is far from optimal.
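A rough calculation illustrates the inefficiency. The 40% voice-activity figure below is an assumption (a commonly cited rule of thumb reflecting half-duplex conversation plus pauses), not a number from the article.

```python
# Each direction of a call holds a dedicated 64-Kbps channel, but a
# given direction carries actual speech well under half the time
# (one party listens while the other talks, and both pause).
channel_bps = 64_000
voice_activity = 0.40   # ASSUMED fraction of time a direction carries speech

useful_bps = channel_bps * voice_activity
print(useful_bps)       # 25600.0 -- the remaining bandwidth sits idle
```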

Packet Network

In comparison to a circuit-switched network where the use of bandwidth is dedicated to a user, packet networks allow multiple users to share network bandwidth. While this increases the efficiency level of network utilization, it introduces several new problems. To obtain an appreciation for those problems, I will illustrate the operation of a generic packet network; this could be a TCP/IP network such as the Internet, a corporate intranet, or even a Frame-Relay network.

Figure 1 shows the flow of data from two different locations over a common backbone packet network infrastructure. In this example, two organizations, labeled 1 and 2, share access to the packet network via packet network node A. Assume that packets from organization 1 flow to network address Z, connected to node E, while packets from organization 2 flow to location Y, also connected to packet network node E.

In Figure 1, packets could flow over different backbone routes; however, their ingress and egress locations are shown to be in common. Assuming that location 2 is transmitting real-time information to location Y, what happens when a packet from location 1 periodically arrives at node A ahead of the packet from location 2? When this situation occurs, the packet from location 1 delays the processing of the packet arriving from location 2.

Suppose data sources connected to nodes B, C, and D all require access to devices connected to node A. When this situation arises, the device at node A may be literally swamped with packets beyond its processing capability. In this situation, the network device at node A may be forced to drop packets. While applications such as a file-transfer could simply retransmit a dropped packet without a user noticing this situation, if real-time data such as voice or video was being transmitted, too many packet drops would become noticeable; they cannot be compensated for by retransmission that further delays real-time information.

Consider what happens when the packet from location 2 is serviced at node A. If other packets require routing to node F, packets from location 2 could be further delayed. After packets from location 2 are forwarded onto the circuit between nodes A and F, they will be processed by node F. At this location, packets arriving from nodes C and E could delay the ability of node F to forward packets destined to node E. Next, packets are forwarded on to node E for delivery to address Y. For the previously described data flow, several variable delays will be introduced, each adversely affecting the flow of packets from location 2 to address Y. Additionally, once a packet reaches node E, it could be delayed by the need to process other packets, such as the packet arriving from location 1 and destined to address Z.
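The way small, random queuing delays at each hop compound into variable end-to-end delay can be sketched as a simulation. All delay figures here are invented for illustration; real values depend on load and link speed.

```python
import random

random.seed(42)

def hop_delay(base_ms=0.5, max_queue_ms=10.0):
    """Fixed processing delay plus a random queuing component (ms)."""
    return base_ms + random.uniform(0.0, max_queue_ms)

# Five packets traverse the same three-hop path (nodes A, F, E
# in the figure); each experiences a different queue at each hop.
path = ["A", "F", "E"]
per_packet_delays = [sum(hop_delay() for _ in path) for _ in range(5)]

print([round(d, 1) for d in per_packet_delays])
# Each packet sees a different total delay -- the source of jitter.
```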

Another characteristic of a packet network is that when a node becomes overloaded, it will send some packets to the great bit bucket in the sky. This is a normal characteristic of packet networks; in fact, one frame relay performance metric is the packet discard rate. Based upon the preceding examination of data flow in a packet network, note that there is no guarantee that packets will arrive at all, arrive with minimal delay, or arrive with a set amount of variation between packets. Because this situation makes it difficult, if not impossible, to transport real-time data over a packet network, various techniques were developed to provide a QoS capability to packet networks. Those techniques fall into three general areas: expediting traffic at the ingress point into the network, expediting traffic through the network, and expediting delivery of traffic at the destination, or egress, point in the network. For each area, several techniques are supported by different hardware and software vendors to provide a QoS capability. Some techniques are standardized, while others will be standardized in the near future.

Although standards are important, not all vendors support all standards, and some standards are currently impractical to implement on a large scale, such as on the Internet. The manner by which an organization connects equipment to the ingress and egress points of a packet network will have a bearing upon whether additional QoS tools and techniques are required to provide an end-to-end transmission capability within certain limits for delay, jitter, and obtainable bandwidth. Keeping this in mind, attention is turned to the ingress point of a packet network. Next, I’ll illustrate the flow of traffic from a LAN to the ingress point. For lack of a better term, I will call this the LAN egress location.

LAN Egress

There are several methods by which a local area network can be connected to a packet network. Although it is common to connect a LAN to a packet network via the use of a router, there are numerous network configurations that can reside behind the router that represent the structure of the corporate LAN, or even an intranet.

Because we are concerned with QoS, imagine a network configuration that allows traffic from several LAN and non-LAN-based sources to be differentiated from one another as the data is passed to a router. The key to this capability is the IEEE 802.1p standard.

The IEEE 802.1p standard defines a Layer 2 (data link layer) signaling technique that permits network traffic to be prioritized. The standard is implemented by relatively recently manufactured compliant Layer 2 switches and routers, which can classify traffic into eight levels of priority. This means that, through the use of IEEE 802.1p-compliant equipment, it becomes possible for time-critical applications on a LAN to receive preferential treatment over non-time-critical applications. It is important to note that IEEE 802.1p is a Layer 2 standard. The priority tag added to LAN frames to differentiate traffic is removed at the Layer 2 to Layer 3 conversion point, when LAN frames are converted to packets for transmission over a WAN. Thus, the IEEE 802.1p standard is only applicable to expediting traffic on a LAN.
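The priority tag itself is small. The 802.1p priority rides in the 4-byte IEEE 802.1Q VLAN tag inserted into the Ethernet frame: a 0x8100 tag type, then a 3-bit priority field (hence the eight levels), a 1-bit CFI, and a 12-bit VLAN ID. A minimal sketch of constructing that tag:

```python
import struct

def build_vlan_tag(priority: int, vlan_id: int, cfi: int = 0) -> bytes:
    """Build the 4-byte 802.1Q tag: TPID 0x8100, then a 16-bit TCI
    laid out as priority(3) | CFI(1) | VLAN ID(12)."""
    assert 0 <= priority <= 7 and 0 <= vlan_id <= 4095
    tci = (priority << 13) | (cfi << 12) | vlan_id
    return struct.pack("!HH", 0x8100, tci)

# Voice traffic tagged with a high priority on (hypothetical) VLAN 100.
tag = build_vlan_tag(priority=6, vlan_id=100)
print(tag.hex())   # 8100c064
```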

Figure 2 illustrates an example of the use of the IEEE 802.1p standard. In this example, a PBX is shown connected to a voice gateway, which, in turn, is connected to a port on a Layer 2 LAN switch. Other switch connections include support for several LAN and server connections, as well as a connection to a router, with the latter providing connectivity to a packet network.

A non-IEEE 802.1p-compliant switch treats all flow requests equally. If a user on one LAN required access to the router at approximately the same time as a user on another LAN, the second request received would be queued behind the first. This is the classic first-in, first-out (FIFO) queuing method.

Under the IEEE 802.1p standard, traffic can be placed into different queues, based upon their level of priority. Thus, voice calls digitized by the voice gateway could be assigned a high level of priority, enabling the switch to service frames generated by the voice gateway and destined to the router, prior to frames originated from other connections destined to the router.
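The difference between FIFO service and priority-based service can be sketched in a few lines. Frame labels and priority values here are invented for illustration.

```python
import heapq
from collections import deque

# Frames as (priority, arrival_order, label); priority 7 is highest.
frames = [(0, 1, "file-transfer"), (6, 2, "voice"),
          (0, 3, "email"), (6, 4, "voice")]

# FIFO: served strictly in arrival order.
fifo_order = list(deque(f[2] for f in frames))

# Priority queuing: higher priority served first (negate for the
# min-heap); arrival order breaks ties among equal priorities.
pq = [(-prio, order, label) for prio, order, label in frames]
heapq.heapify(pq)
prio_order = [heapq.heappop(pq)[2] for _ in range(len(pq))]

print(fifo_order)   # ['file-transfer', 'voice', 'email', 'voice']
print(prio_order)   # ['voice', 'voice', 'file-transfer', 'email']
```

With priority service, the voice frames no longer wait behind the long file transfer.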

Router Egress

Although only one connection from the Layer 2 switch to the router is shown in Figure 2, a router can actually have several LAN-side and WAN-side connections. Because a heavily utilized router can add a significant delay to real-time traffic, router manufacturers added several techniques to expedite the flow of traffic through their products. Some of those techniques represent compliance with industry standards, while other techniques are proprietary features incorporated into their products.

Router Queuing Methods

There are several queuing methods beyond FIFO queuing supported by router manufacturers. Some of those additional queuing methods include priority queuing, custom queuing, and weighted fair queuing.

Priority queuing permits users to prioritize traffic based upon network protocol, incoming interface, packet size, and source and destination addresses. Because digitized voice traffic is transported in relatively short packets, you could use priority queuing to expedite digitized voice ahead of file transfers and other types of traffic carried in relatively long packets.

Under custom queuing, you can share bandwidth among applications. Thus, through its use, you could ensure that a voice or video application obtains a guaranteed portion of bandwidth at an entry point into a WAN. Under weighted fair queuing, interactive traffic is assigned to the front of a queue to reduce response time; the remaining bandwidth is shared among high-bandwidth traffic flows.
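The bandwidth-sharing idea behind custom queuing can be sketched as a round-robin in which each queue is granted a byte budget per pass, giving it a rough guaranteed share of the link. Queue names, packet sizes, and budgets below are invented for illustration.

```python
# Two queues: many short voice packets, fewer long data packets.
queues = {
    "voice": {"budget": 3000, "pending": [200] * 30},
    "data":  {"budget": 1500, "pending": [1500] * 10},
}

def service_round(queues):
    """One round-robin pass; each queue may send up to its byte budget."""
    sent = {}
    for name, q in queues.items():
        remaining, out = q["budget"], 0
        while q["pending"] and q["pending"][0] <= remaining:
            size = q["pending"].pop(0)
            remaining -= size
            out += size
        sent[name] = out
    return sent

sent = service_round(queues)
print(sent)   # {'voice': 3000, 'data': 1500}
```

Per round, voice is guaranteed roughly twice the bytes of data, regardless of how much data traffic is waiting.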

Although the previously mentioned router-queuing methods are proprietary, there are three traffic-expediting methods that are standardized and involve the use of router queues. Because each method expedites the flow of data into or through a WAN, I’ll focus attention on those techniques.

Traffic Expediting

Within the IP header is a field labeled Type of Service, also referred to as the Service Type field. This 8-bit field was intended to allow applications to indicate the type of routing path they would like, such as low delay, high throughput, or high reliability for a real-time application. Although a great idea, this field is rarely used and is usually set to a value of 0.

A second traffic-expediting method recognizes the limited use of the Service Type field and both renames the field and reassigns values to it. This method reuses the Service Type field as a DiffServ (Differentiated Services) field. Under DiffServ, traffic definitions are assigned to denote the manner by which data flows are handled by routers. Presently, Assured Service and Preferred Service (each with slightly different definitions of service) have been defined by the Internet Engineering Task Force (IETF).
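The reuse involves a single header byte. DiffServ redefines the old 8-bit Type of Service byte as a 6-bit Differentiated Services Code Point (DSCP) followed by two unused bits; a minimal sketch of the bit manipulation, using the Expedited Forwarding code point (46) from the IETF definitions as an example:

```python
def dscp_to_tos_byte(dscp: int) -> int:
    """Place a 6-bit DSCP in the six high-order bits of the old
    Type of Service byte (low two bits unused under DiffServ)."""
    assert 0 <= dscp <= 63
    return dscp << 2

def tos_byte_to_dscp(tos: int) -> int:
    return tos >> 2

EF_DSCP = 46   # Expedited Forwarding, commonly used for voice
print(hex(dscp_to_tos_byte(EF_DSCP)))   # 0xb8
```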

A third traffic-expediting method involves mapping a flow of traffic between two locations and adding a label to packets to expedite their routing. The goal behind this label method is to avoid searching through each router’s address table to find the relevant port to output a packet. Because a router could have thousands of entries in its address tables, the ability to bypass the address search expedites the flow of traffic through the router. This technique was originally referred to as tag-switching by Cisco. It was standardized by the IETF as Multi-protocol Label Switching (MPLS).
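The speedup from label switching can be sketched by contrasting the two lookups: a routing decision requires a longest-prefix search over the address table, while a label is a single exact-match step. Table contents below are invented for illustration, and the linear search is deliberately naive.

```python
# A tiny routing table: prefix -> output port.
route_table = {"10.0.0.0/8": "port1", "10.1.0.0/16": "port2",
               "10.1.2.0/24": "port3"}

def to_bits(addr: str) -> str:
    return "".join(f"{int(octet):08b}" for octet in addr.split("."))

def longest_prefix_match(addr_bits: str, table: dict) -> str:
    """Naive search: examine every prefix, keep the longest match."""
    best, best_len = None, -1
    for prefix, port in table.items():
        net, plen = prefix.split("/")
        plen = int(plen)
        if addr_bits.startswith(to_bits(net)[:plen]) and plen > best_len:
            best, best_len = port, plen
    return best

# A label table needs only one exact-match lookup per packet.
label_table = {17: "port3", 42: "port1"}

addr_bits = to_bits("10.1.2.9")
print(longest_prefix_match(addr_bits, route_table))   # port3
print(label_table[17])                                # port3 -- one step
```

Real routers use far better prefix-search structures than this loop, but the label lookup still avoids the search entirely.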

Link Efficiency

Because any technique that reduces the number of bytes of traffic increases the efficiency of a WAN, several compression methods are now standardized. Two of these techniques are TCP header compression and Real-Time Transport Protocol (RTP) header compression. RTP represents a standard for time-stamping packets, which allows a receiving device to use a jitter buffer to remove timing discrepancies, or variations, between packets. While RTP is key to ensuring that digitized voice is reconstructed in a manner acceptable to a listener, the combined IP, UDP, and RTP headers total 40 bytes, which is relatively high in proportion to the typical 40- to 120-byte payload of a voice datagram. Thus, both TCP header and RTP header compression can be used to reduce the volume of traffic that must flow through a WAN.
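A back-of-the-envelope calculation shows why the overhead matters. The 4-byte compressed-header figure is an assumption (RTP header compression typically reduces the 40-byte IP/UDP/RTP header stack to roughly 2 to 4 bytes):

```python
# Overhead of an uncompressed voice packet at the low end of the
# payload range cited above.
payload = 40                       # bytes of voice samples per packet
uncompressed_hdr = 20 + 8 + 12     # IP + UDP + RTP headers
compressed_hdr = 4                 # ASSUMED typical compressed size

before = uncompressed_hdr + payload
after = compressed_hdr + payload
savings = 1 - after / before
print(f"{savings:.0%}")            # 45% -- fewer bytes per packet on the WAN
```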

Reserving Bandwidth

In our discussion of the operation of the telephone company circuit-switched network, we noted that the key to its QoS capability is that 64 Kbps of bandwidth is allocated to each voice call for the duration of the call. In the wonderful world of IP, the Resource Reservation Protocol (RSVP) was standardized by the IETF to provide QoS by allowing applications to dynamically reserve network bandwidth. While RSVP can be used on small to medium-size intranets, it does not scale to a network the size of the Internet. Additionally, as bandwidth allocations cross ISP boundaries, there is no present method to bill for the allocation of bandwidth. Although some ISPs may support RSVP within their portion of the Internet, it is doubtful that the use of RSVP will cross ISP boundaries in the foreseeable future.
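The bookkeeping that a reservation protocol such as RSVP implies can be sketched as admission control: each link tracks committed bandwidth, and a new flow is admitted only if every hop on its path has headroom. Link names and capacities below are invented for illustration (T1-rate links).

```python
# Per-link capacity and current reservations, in bps.
link_capacity = {"A-F": 1_544_000, "F-E": 1_544_000}
reserved = {link: 0 for link in link_capacity}

def reserve(path, bps):
    """Admit a flow only if every link on the path can cover it."""
    if all(reserved[link] + bps <= link_capacity[link] for link in path):
        for link in path:
            reserved[link] += bps
        return True
    return False

path = ["A-F", "F-E"]
print(reserve(path, 64_000))                          # True: one call admitted
print(all(reserve(path, 64_000) for _ in range(23)))  # True: 24 calls fit
print(reserve(path, 64_000))                          # False: capacity exhausted
```

Once the links are full, further calls are refused rather than degraded, mirroring the telephone network's fast-busy behavior. The scaling problem is that every router on every path must keep this per-flow state.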

WAN Egress

Once traffic flows through a packet network, it must be delivered to its recipient. For real-time traffic that requires a high QoS capability, the flow of packets from the packet network egress point to the recipient must be expedited. To do so, routers can use different types of priority queuing and the destination LAN can employ IEEE 802.1p priority switching. Thus, both ingress and egress to and from the packet network can involve several methods to expedite the flow of Layer 2 traffic.


Unlike the telephone network, where the use of 64 Kbps of bandwidth from source to destination provides a quality of service, the use of a packet network can require numerous techniques to minimize delay and jitter. In future articles, I will focus on a more detailed illustration of those techniques, examining the IEEE 802.1p standard, the manner by which different router queuing methods operate, and other techniques that are part of the packet QoS puzzle.

About the Author

Gilbert Held is an award-winning lecturer and author. Gil is the author of over 40 books and 250 technical articles. Some of Gil’s most recent publications include Voice and Data Internetworking 2ed. and Cisco Security Architecture (co-authored with Kent Hundley), both published by McGraw Hill Book Company, and LAN Performance 3ed. and Internetworking LANs and WANs 2ed., published by John Wiley & Sons. Gil can be reached via email at: