IP Networking over ATM

David Rush

This article discusses the trials and tribulations of building and maintaining switched and permanent virtual circuits (SVCs and PVCs) as used in the MAGIC and AAI ATM networks. It is based on my experiences in maintaining and expanding one site on the MAGIC and AAI networks, and, I hope, provides a real-world account of the way these networks are administered, enhanced, and repaired. The article is written for administrators and users with some experience with IP networking, but little or no experience with ATM networking.

Introduction

The Multidimensional Applications and Gigabit Internetwork Consortium (MAGIC) project (http://www.magic.net) was initiated in 1992 and consists of an OC48 SONET (synchronous optical network) network running at 2.4 gigabits per second (Gbps). Most of the optical network is provided by Sprint. Other members of the consortium are Minnesota Supercomputing Center, Inc (MSCI), Minneapolis, MN; the University of Kansas (KU) at Lawrence, KS; EROS (Earth Resources Observation Station) Data Center (EDC) in Sioux Falls, SD; the Army's Battle Command Battle Lab (BCBL) at Fort Leavenworth, KS; the Lawrence Berkeley National Labs (LBNL) at Berkeley, CA; and SRI International.

The Advanced Communications Technology Satellite (ACTS) asynchronous transfer mode (ATM) Internetwork (AAI) project (http://www1.arl.mil/HPCMP/AAI/) began in 1994. AAI includes an ATM network that gets most of its connectivity from Sprint's commercial ATM service, and the remainder from MAGIC.

An ATM Primer

ATM networks are becoming an established form of digital communications. The ATM standard was developed by a multinational community to support the transport of a variety of data types, including digitized voice, digitized video, and computer network traffic. The ATM standard was designed to be scalable to very high data rates; ATM products are available today with ports that operate at data rates as high as 622 megabits per second (OC12).

ATM devices communicate by passing 53-byte cells, each containing 5 bytes of header information and 48 bytes of "payload" data. As an ATM cell traverses the ATM network, each ATM switch in the network inspects the header and makes a decision about where to send that cell based on tables kept by the switch.

The focus of this article is predominantly on IP networking over ATM. IP uses ATM by chopping the IP datagrams into cells, transmitting the cells to the destination, and then reassembling them into the original datagram.
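
As a rough illustration of the overhead involved, the sketch below (Python, my own illustration rather than anything from the MAGIC or AAI software) estimates how many cells a datagram occupies, ignoring adaptation-layer details such as trailers and padding:

    import math

    CELL_PAYLOAD = 48   # payload bytes in each 53-byte ATM cell
    CELL_TOTAL = 53     # payload plus 5-byte header

    def cells_needed(datagram_bytes):
        # Number of cells a datagram of this size occupies, ignoring
        # ATM adaptation layer trailers and padding.
        return math.ceil(datagram_bytes / CELL_PAYLOAD)

    # A 1500-byte IP datagram takes 32 cells, or 32 * 53 = 1696 bytes
    # on the wire.
    print(cells_needed(1500))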

A Simple Switch

An ATM switch has multiple ports, often offering different speeds to suit varying needs. Each port is one end of a point-to-point circuit with just two devices on it (as opposed to an Ethernet bus, with many devices on it). OC3 (155 Mbps) and DS3 (45 Mbps) ports are common, and lower data rates are readily available. Cells that arrive on a port are inspected for their header information, which includes a virtual path indicator (VPI) and a virtual circuit indicator (VCI).

A switch uses the port, VPI, and VCI information to find a matching entry in a lookup table. The table will identify which port to send the cell out on, as well as new VPI and VCI information to replace the old. For example, a switch might contain a table entry that tells it to take all cells that arrive on port 7 with a VP of 53 and a VC of 218, and send them out on port 4, with a VP of 16 and a VC of 143.
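
The forwarding table can be pictured as a simple lookup keyed on the arriving port, VPI, and VCI. The sketch below (Python, purely illustrative) models the example entry just described:

    # Illustrative model of a switch's forwarding table.
    # Key:   (incoming port, VPI, VCI)
    # Value: (outgoing port, new VPI, new VCI)
    table = {
        (7, 53, 218): (4, 16, 143),
    }

    def forward(port, vpi, vci):
        # Return where the cell goes next and its rewritten VPI/VCI,
        # or None if the switch has no entry for it.
        return table.get((port, vpi, vci))

    print(forward(7, 53, 218))   # -> (4, 16, 143)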

The ATM standard specifies that each port can handle at least 256 virtual paths, and that each virtual path can handle up to 65536 virtual circuits. The virtual circuits can be thought of as "nested" inside their virtual paths (see Figure 1). VPI numbers can be reused on different ports and VCI numbers can be reused on different VPIs.

Limitations such as the amount of memory available to a switch can constrain the size of the tables, which can artificially limit the number of VPIs and VCIs below what the standard allows. For example, many of the switches in MAGIC and AAI by default are limited to VCIs between 0 and 256, although they can be reconfigured to support more.

Virtual Circuits

A collection of switches can be coordinated to create a "virtual circuit" between two end devices, such as workstations' network interfaces. Each switch will have one table entry for each virtual circuit that passes through it. Cells that are sent "into" the virtual circuit with the correct VPI and VCI information will traverse the network and arrive at their intended destination. ATM networks can handle a large number of independent virtual circuits, between common endpoints and different endpoints. A typical IP network over ATM will result in a "full mesh" with each workstation having two virtual channels (one "to" and one "from") associated with every other workstation.
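
The size of such a mesh grows quickly: with n workstations, a full mesh needs n * (n - 1) one-way virtual circuits. The short calculation below (Python, illustrative only) shows how fast that adds up:

    def full_mesh_circuits(n_hosts):
        # One-way virtual circuits needed for a full mesh of n hosts.
        return n_hosts * (n_hosts - 1)

    for n in (5, 10, 20):
        print(n, "hosts:", full_mesh_circuits(n), "one-way circuits")
    # 5 hosts: 20, 10 hosts: 90, 20 hosts: 380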

Permanent Virtual Circuits

A permanent virtual circuit (PVC) is typically built by hand. On a workstation with an ATM interface, the IP network administrator would normally create a boot-time script that specifies which ATM interface, VPI, and VCI should be used to reach a given IP number (for the outgoing cells), and on which interface, VPI, and VCI the workstation should "listen" for incoming cells. Although it is not required, keeping the VPI and VCI information the same across the network, and the same for both directions of a two-way virtual circuit (which is actually two virtual circuits going in opposite directions), makes PVC-based networks a little easier to manage. Keeping that consistency gets more difficult as networks grow larger, however, especially across different administrative regions.
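
The exact boot-time commands vary from vendor to vendor, but conceptually the script records a small table like the one sketched below (Python, with entirely hypothetical interface names and addresses; the real commands depend on the workstation's ATM driver):

    # Hypothetical PVC configuration for one workstation.
    # Each entry: to reach this IP address, send cells out this ATM
    # interface on this VPI/VCI, and listen on the same VPI/VCI for
    # cells coming back.
    pvc_config = [
        # (peer IP,       interface, VPI, VCI)
        ("192.168.10.2",  "atm0",    0,   101),
        ("192.168.10.3",  "atm0",    0,   102),
    ]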

Each switch that carries PVCs needs to have each port/VPI/VCI table entry entered by hand as well, although scripts can make the job easier. Switches can store their tables so that they need not be rebuilt every time the switch is restarted. Because of the long setup time and the need for human intervention, PVCs are usually built and kept "up" for long periods of time.

PVCs are analogous to leased telephone lines; that is, permanently installed between two specific endpoints.

Switched Virtual Circuits

A switched virtual circuit (SVC) is created automatically, on demand. SVC use requires a signaling protocol that allows one network endpoint to initiate a "call" to another network endpoint. The switches communicate with the endpoints and each other to create a virtual circuit across the network. SVCs stay "up" only as long as necessary.

To use SVCs, an IP network administrator must install and configure SVC-capable software on the workstations and switches. Fortunately, most workstation ATM drivers and ATM switches that are suitable for LAN applications (and many that are suitable for WAN applications) come with some kind of SVC capability.

SVCs are analogous to dial-up telephone lines; however, one workstation can have many SVCs active at once.

MAGIC

The MAGIC network consists of five primary sites: the TIOC at Sprint's facility in Overland Park, KS; MSCI in Minneapolis, MN; KU in Lawrence, KS; EDC in Sioux Falls, SD; and the Army's BCBL at Fort Leavenworth, KS. At the physical layer (see Figure 2), the MAGIC network uses a hub topology: the TIOC (the hub) has a SONET circuit to each of the other sites, with a minimum capacity of two operating OC3s to any other site. The two OC3s are multiplexed inside the OC48s that connect the outlying sites to the hub. Planning is underway to replace the two OC3s with a single OC12 to each site.

Four of the five MAGIC sites use ATM switches made by FORE Systems, Inc. These switches come equipped with SPANS, an SVC signaling scheme. Most of the workstation interfaces that connect to FORE switches are also FORE products, and support SPANS. As a result of the widespread availability of SPANS, most of MAGIC was SVC-based from the beginning. The fifth site uses Digital Equipment Corporation (DEC) ATM switches and DEC network interfaces on DEC workstations. Since the DEC and FORE equipment did not support a common signaling protocol, all of the cross-vendor virtual circuits were necessarily PVCs. Since the MAGIC network connected only five sites, with a small number of workstations at each site (typically 1-5), the PVCs were manageable, although the management was sometimes tedious.

As of this writing, MAGIC is preparing to implement UNI 3.0, a standard SVC signaling scheme that is supported on all the switches and workstation interfaces used in MAGIC. Completion of this implementation should greatly ease the PVC burden.

AAI

The Advanced Communications Technology Satellite (ACTS) ATM Internetwork (AAI) is another experimental and research network that includes three MAGIC sites and several other sites (12 sites as of this writing) scattered around the country. AAI's backbone (between sites) is on Sprint's commercial ATM service, which is often viewed as a "cloud" with every site connecting to the cloud. The cloud (actually a collection of high-capacity ATM switches that are fully meshed so that any switch can reach any other switch in one ATM hop) can be configured by Sprint's Broadband Operations Center (BBOC) to allow any site to reach one, some, or all other sites. A typical site on AAI has at least one ATM switch that connects to the cloud, to local workstations, and/or to other local switches.

When first brought online, AAI used a full mesh of PVCs (i.e., every workstation with an ATM interface on AAI had one PVC to every other workstation on AAI). Some sites have as few as two workstations, but other sites have several. This resulted in several pages of PVCs -- one PVC pair between every two hosts on AAI. Every time new hosts came online, a request was sent to Sprint's BBOC for a new mesh from the new workstation to all existing workstations, a procedure that quickly became unmanageable.

The answer was to create a full mesh -- on a site-by-site basis -- of virtual paths (VPs). Sprint's BBOC configured each site to have one VP to each of the other sites. Once the VPs were up, any VCs (PVCs or SVCs) created on the correct VP would automatically pass through the cloud to the destination site without needing further intervention from the BBOC. This revised process was a big improvement, especially when new hosts came online.

The eventual implementations of UNI 3.0 were a boon to AAI. UNI 3.0 provides SVC signaling across the cloud's VP mesh. Once fully transitioned, the only PVCs that are needed on AAI involve the few, unusual workstations that do not provide UNI 3.0 signaling. The vast majority of hosts on AAI do support UNI 3.0 signaling.

The MAGIC IP Network

At the Internet protocol (IP) layer, the MAGIC network is flat. That is, all of the MAGIC hosts (workstations and switches) are on the ATM network and are within the same class C IP network, so they can reach each other directly without going through any routers. One central coordinator allocates IP numbers as needed, which works fine with the small number of hosts on the network.

For communications to take place, the same VP must be used on the two ends of every link (either between two switches, between a switch and workstation, or even between two workstations that are directly connected). VP 0 is configured on most switches by default, so MAGIC uses that virtual path for all of its links. Also, VP 0 must be used between each workstation and the switch to which it is connected, because the workstation interfaces only support VP 0. Nonzero VPs could be used on the links between switches, but in the MAGIC network, there's no advantage to it.

The size of the MAGIC network allowed for PVCs (where needed) to use the same VP/VC numbers for the entire length of each circuit (which rarely exceeded three switches). That scheme worked fine initially, but network growth (i.e., VC crowding) is making it more and more difficult to maintain.

The AAI IP Network

Initially AAI was brought online as one class C-sized subnet of a class B IP network. This allowed AAI to quickly establish connectivity using a borrowed subnet from one of the participants.

Once underway, the single IP network topology became constraining because of AAI's growth and the rapid change of some sites' configurations. This problem was resolved by requesting a contiguous block of 16 class C networks, one for each site.

Using the capabilities of classless interdomain routing (CIDR; see RFC 1519), AAI planned to "supernet" (as opposed to "subnet") the contiguous class C networks into one larger network (with a 20-bit netmask) in which all hosts could reach each other directly. The AAI sites quickly discovered that CIDR was not well supported, and reverted to keeping their class C nets and creating static routes to each of the other class C networks (with a metric of 0). This resulted in 11 static routes on every machine -- a hack to be sure, but it works without routers.
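
For reference, 16 contiguous class C (/24) networks aggregate exactly into one network with a 20-bit netmask, since 24 - log2(16) = 20. The sketch below (Python; the 198.51.96.0 base address is made up, not AAI's real allocation) shows the relationship:

    import ipaddress

    # 16 contiguous class C (/24) networks collapse into a single /20.
    supernet = ipaddress.ip_network("198.51.96.0/20")
    class_cs = list(supernet.subnets(new_prefix=24))

    print(len(class_cs))                  # 16
    print(class_cs[0], class_cs[-1])      # 198.51.96.0/24 ... 198.51.111.0/24

    # Lacking working CIDR support, each host instead carried one static
    # route per remote class C network (11 routes for 12 sites).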

Multiple IP Networks on One ATM Network

There is overlap between MAGIC and AAI. There are three sites whose switches support both IP networks. One site (the TIOC) is physically attached to both nets, and provides the crossover for the other two sites. Aside from dealing with a larger number of VCs and occasional conflicting requirements between the two networks, having both networks run on a single switch works fine. In fact, several other IP networks are running on various switches on MAGIC and AAI. For example, the MAGIC hub switch and MAGIC/AAI crossover switch both support multiple IP networks as well as providing part of a link between two ATM-capable IP routers.

PVC Management

If the target hardware and networks support SVCs, that scheme will generally provide the easiest management of the ATM network; however, PVCs are sometimes necessary. When MAGIC was first brought online, PVCs were only required on the one non-FORE switch and its attached hosts. At that time the list of PVCs was centrally coordinated and any one PVC used the same VCI and VPI numbers (all VP 0, in fact) for the entire circuit. VCI numbers were kept unique throughout the network.

As AAI evolved with its initially heavy use of PVCs and multiple administrative regions, central coordination and network-wide uniqueness of PVCs became impractical. Where necessary, PVC assignment has been delegated to individual sites.

A variety of PVC management schemes have been tried, but the end result for now is the brute force method: keep a file of what's being used where, and keep that file up-to-date. When the need arises for a new PVC, a common VPI/VCI is agreed upon by the site administrators at each end, and that VPI/VCI is used between the sites' switches. Once the PVC enters a site, that site's administrator decides how to map that PVC to the workstation. My technique is to keep a list of PVCs between sites (which often use nonzero VPs) and PVCs from the local switches to the local workstations, all on VP 0. The remote sites neither know nor care what VPIs and VCIs are used on my end of each PVC.
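
The tracking file itself can be as simple as a few columns per PVC. The excerpt below is a hypothetical example of such a format, not the actual file:

    # remote site   inter-site VPI/VCI   local switch port   local VPI/VCI   local host
    siteA           3/101                1C2                 0/101           hostA
    siteB           5/210                1D1                 0/210           hostA
    siteB           5/211                1D1                 0/211           hostB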

As a convenience to the other site administrators in the MAGIC and AAI networks, I keep my PVC list on a password-protected web site so that they can consult it as necessary.

Troubleshooting Techniques

Several things can go wrong with these networks. When a host for which I am responsible cannot reach another host, I have learned to check a number of possibilities. There are some IP layer tools that help, but here I will concentrate on the ATM issues.

Check Carrier Indicators

Most ATM equipment will have something to indicate whether or not a carrier is present on a physical circuit. If the carrier is not there, then nothing is going to work on that circuit. Check all the relevant cabling to make sure something is not unplugged or damaged. If your ATM circuit to the wide area is showing a loss of carrier, chances are that your provider is already aware of it, but call them anyway to report a problem.

On the FORE Systems equipment, carrier status can be checked by observing the LEDs on the switches and workstation adapters (off is good), or using utility programs on the switches and workstations (these make it easy to check things remotely).

Check PVC Mappings

If one or more workstations cannot talk over a given circuit, but others can, then the physical circuit is probably fine. If you are using PVCs, check all the switches to make sure the mappings are still there and correct. Even when using SVCs I have often found it useful to build a test PVC between two workstations to debug a circuit. Because it is a PVC, I know what the VPI and VCI numbers are, and they will stay configured until I delete them (unlike SVCs that can come and go automatically).

Check Cell Counters

Once the PVC mappings on the switches and end workstations are confirmed to be intact, start a continuous ping on one of the workstations. Using the utility programs supplied with the workstation ATM adapters, check to make sure that the outgoing cell count on the pinging workstation increases each time you check it. If not, check that your workstation's ATM interface and drivers are properly installed, and are properly configured at the ATM and IP layers.

Once your workstation is generating cells, go to each switch in the circuit, starting at the switch nearest the pinging workstation, and view the cell counters on the PVC that you are using. As soon as you find cell counters that are not rising, you have isolated where the problem is.

If the cell counters from the pinging workstation through the switches all look good, then check the pinged workstation for the incoming cells of the ping request. Then check the outgoing cell counters to make sure the workstation is generating a ping reply. Finally, check the switches on the return path from the pinged to the pinging workstations.
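
The logic behind the counter checks is simple enough to sketch. The snippet below (Python, with a placeholder read_cell_counter function standing in for whatever utility or SNMP query your switch vendor supplies) shows the comparison made at each hop:

    import time

    def read_cell_counter(switch, port, vpi, vci):
        # Placeholder: in practice this wraps the vendor's utility or an
        # SNMP query for the per-VC cell counter on the given switch port.
        raise NotImplementedError

    def counter_is_rising(switch, port, vpi, vci, wait=5):
        # Sample the counter twice while the test ping runs. If it does
        # not increase, this hop is where cells stop flowing.
        before = read_cell_counter(switch, port, vpi, vci)
        time.sleep(wait)
        after = read_cell_counter(switch, port, vpi, vci)
        return after > before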

Gotchas

There are a couple of things that I have learned (the hard way) about DS3, SONET, and other circuits. The first is timing. The timing attribute applies to the ports on switches, and concerns whether the switch generates the clocking that the port uses on its circuit, or whether it uses a clock from the device at the other end of the circuit. Normally a circuit that comes from a provider (the phone company) will provide the timing, so set that port on your switch to "external" or "network" timing. Other settings of timing will depend on specific installations.

The second issue that has caused some head-scratching is scrambling. Scrambling in this case is done for timing reasons, not for security. I will not go into the details of scrambling, except to say that both ends of a circuit should have their switch ports set the same -- either on or off. Most DS3 and SONET circuits in MAGIC and AAI have scrambling set to "on."

Different scrambling settings on either end of a circuit can cause cell counter checks to look okay, but prevent the receiving workstation from properly decoding cells.

Bandwidth

An appetite for network bandwidth is often a reason for going to ATM networking, and ATM can provide it. However, it is important to realize that all parts of the network need to be up to the job, including the workstations and their operating systems.

MAGIC and AAI participants have demonstrated repeatedly that while the network can support high data rates, the workstations often have trouble generating traffic near the capacity of the circuits. The classic problem is one of relatively small TCP windows. With fast ATM circuits (such as OC3), the bandwidth-delay product -- the amount of data that can be "in flight" before an acknowledgment returns -- goes up. Consequently, a protocol that requires a receiving workstation to acknowledge (ACK) data (such as TCP) can cause the sending workstation to spend a disproportionately high amount of time waiting for ACKs rather than actually sending data. Fortunately, several workstation operating systems are offering "large window" options for their TCP implementations, allowing larger amounts of data to be sent before waiting for an ACK.
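
A rough bandwidth-delay calculation shows the scale of the problem. With the illustrative numbers below (assumptions, not measurements from MAGIC or AAI), a 64 KB TCP window keeps an OC3 path only about one-tenth full:

    # Rough bandwidth-delay arithmetic with illustrative numbers.
    rate_bps = 155e6        # OC3 line rate, ignoring SONET/ATM overhead
    rtt = 0.030             # assumed 30 ms round-trip time

    bytes_in_flight = rate_bps / 8 * rtt
    print(bytes_in_flight)              # ~581 KB fits "in the pipe"

    window = 64 * 1024                  # largest TCP window without the
                                        # large-window (window scaling) option
    print(window / bytes_in_flight)     # ~0.11: the sender idles most of the time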

My experiences with MAGIC and AAI prove that ATM networks work, and work well for IP, but are not without their own new set of issues.

About the Author

David Rush is a software engineer for SRI International, working under contract at Sprint's TIOC. He can be reached at rush@erg.sri.com.