IP Networking over ATM
David Rush
This article discusses the trials and tribulations of
building and
maintaining switched and permanent virtual circuits
(SVCs and PVCs) as
used in the MAGIC and AAI ATM networks. It is based
on my experiences in
maintaining and expanding one site on the MAGIC and
AAI networks, and I
hope it provides a real-world account of the way these
networks are
administered, enhanced, and repaired. The article is
written for
administrators and users with some experience with IP
networking, but
little or no experience with ATM networking.
Introduction
The Multidimensional Applications and Gigabit Internetwork
Consortium
(MAGIC) project (http://www.magic.net) was initiated
in 1992 and
consists of an OC48 SONET (synchronous optical network)
network running
at 2.4 gigabits per second (Gbps). Most of the optical
network is
provided by Sprint. Other members of the consortium
are Minnesota
Supercomputing Center, Inc (MSCI), Minneapolis, MN;
the University of
Kansas (KU) at Lawrence, KS; EROS (Earth Resources Observation
Station)
Data Center (EDC) in Sioux Falls, SD; the Army's Battle
Command Battle
Lab (BCBL) at Fort Leavenworth, KS; the Lawrence Berkeley
National Labs
(LBNL) at Berkeley, CA; and SRI International.
The Advanced Communications Technology Satellite (ACTS)
asynchronous
transfer mode (ATM) Internetwork (AAI) project
(http://www1.arl.mil/HPCMP/AAI/) began in 1994. AAI
includes an ATM
network that gets most of its connectivity from Sprint's
commercial ATM
service, and the remainder from MAGIC.
An ATM Primer
ATM networks are becoming an established form of digital
communications.
The ATM standard was developed by a multinational community
to support
the transport of a variety of data types, including
digitized voice,
digitized video, and computer network traffic. The ATM
standard was
designed to be scalable to very high data rates; ATM
products are
available today with ports that operate at data rates
as high as 622
megabits per second (OC12).
ATM devices communicate by passing 53-byte cells, containing
5 bytes of
header information and 48 bytes of "payload"
data. As an ATM cell
traverses the ATM network, each ATM switch in the network
inspects the
header and makes a decision about where to send that
cell based on
tables kept by the switch.
The focus of this article is predominantly on IP networking
over ATM. IP
uses ATM by chopping the IP datagrams into cells, transmitting
the cells
to the destination, and then reassembling them into
the original
datagram.
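As a rough illustration of the arithmetic involved, the sketch
below (Python, assuming AAL5-style encapsulation with an 8-byte
trailer, padding to a 48-byte boundary, and a hypothetical 8-byte
LLC/SNAP header) estimates how many cells one datagram occupies;
the exact overhead depends on the encapsulation actually in use.

    import math

    CELL_PAYLOAD = 48   # data bytes carried by each 53-byte cell
    AAL5_TRAILER = 8    # AAL5 trailer (length, CRC, etc.)

    def cells_for_datagram(ip_bytes, encap_overhead=8):
        """Estimate the ATM cells needed to carry one IP datagram."""
        payload = ip_bytes + encap_overhead + AAL5_TRAILER
        return math.ceil(payload / CELL_PAYLOAD)

    # A 1,500-byte datagram becomes 32 cells (1,696 bytes on the wire),
    # so roughly one byte in nine goes to headers, trailer, and padding.
    print(cells_for_datagram(1500))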
A Simple Switch
An ATM switch has multiple ports, often offering different
speeds to
suit varying needs. Each port is one end of a point-to-point
circuit
with just two devices on it (as opposed to an Ethernet
bus, with many
devices on it). OC3 (155 Mbps) and DS3 (45 Mbps) ports
are common, and
lower data rates are readily available. Cells that arrive
on a port are
inspected for their header information, which includes
a virtual path
identifier (VPI) and a virtual channel identifier (VCI).
A switch uses the port, VPI, and VCI information to
find a matching
entry in a lookup table. The table will identify which
port to send the
cell out on, as well as new VPI and VCI information
to replace the old.
For example, a switch might contain a table entry that
tells it to take
all cells that arrive on port 7 with a VP of 53 and
a VC of 218, and
send them out on port 4, with a VP of 16 and a VC of
143.
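In code, that lookup amounts to nothing more than a keyed table.
The sketch below (Python, with an invented table entry matching the
example above; real switches do this in hardware) shows the idea:

    # Forwarding table keyed by (in_port, vpi, vci); each value gives
    # the outgoing port and the new VPI/VCI written into the cell header.
    vc_table = {
        (7, 53, 218): (4, 16, 143),
    }

    def forward(in_port, vpi, vci):
        """Return (out_port, new_vpi, new_vci), or None if no entry exists."""
        return vc_table.get((in_port, vpi, vci))

    print(forward(7, 53, 218))   # -> (4, 16, 143)
    print(forward(7, 53, 219))   # -> None: no circuit configured for that VC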
The ATM standard specifies that each port can handle
at least 256
virtual paths, and that each virtual path can handle
up to 65536 virtual
circuits. The virtual circuits can be thought of as
"nested" inside
their virtual paths (see Figure 1). VPI numbers can
be reused on
different ports and VCI numbers can be reused on different
VPIs.
Limitations such as the amount of memory available to
a switch can
constrain the size of the tables, which can artificially
limit the
number of VPIs and VCIs below what the standard allows.
For example,
many of the switches in MAGIC and AAI by default are
limited to VCIs
between 0 and 256, although they can be reconfigured
to support more.
Virtual Circuits
A collection of switches can be coordinated to create
a "virtual
circuit" between two end devices, such as workstations'
network
interfaces. Each switch will have one table entry for
each virtual
circuit that passes through it. Cells that are sent
"into" the virtual
circuit with the correct VPI and VCI information will
traverse the
network and arrive at their intended destination. ATM
networks can
handle a large number of independent virtual circuits,
both between the same
pair of endpoints and between different endpoints. A typical IP network
over ATM will
result in a "full mesh" with each workstation
having two virtual
channels (one "to" and one "from")
associated with every other
workstation.
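The number of circuits in such a mesh grows quickly with the number
of hosts. A quick back-of-the-envelope check (plain arithmetic, not
tied to any particular network) makes the point:

    def full_mesh_circuits(hosts):
        # Each host keeps one "to" and one "from" VC to every other host.
        return hosts * (hosts - 1)

    for n in (5, 10, 25, 50):
        print(n, "hosts:", full_mesh_circuits(n), "one-way circuits")
    # 5 hosts: 20, 10 hosts: 90, 25 hosts: 600, 50 hosts: 2450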
Permanent Virtual Circuits
A permanent virtual circuit (PVC) is typically built
by hand. On a
workstation with an ATM interface, the IP network administrator
would
normally create a boot-time script that identifies which
ATM interface,
VPI, and VCI should be used to reach a given IP number
(for the outgoing
cells), and specifies the interface, VPI, and VCI on which to
"listen" for incoming cells. Although it is not required,
keeping the VPI
and VCI information the same across the network, and
the same for both
directions of a two-way virtual circuit (which is actually
two virtual
circuits going in opposite directions) makes PVC-based
networks a little
easier to manage. Keeping that consistency gets more
difficult with
increasingly large networks, however, especially across
different
administrative regions.
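The exact commands for building these entries are vendor-specific,
so the sketch below (Python, with made-up interface names, addresses,
and VPI/VCI values) only illustrates the bookkeeping that such a
boot-time script captures: one entry per remote host, with the same
VPI/VCI used for the outgoing mapping and the incoming "listen."

    # Hypothetical PVC plan for one workstation; the VCI is chosen once
    # per host pair and reused in both directions across the network.
    pvc_plan = [
        # local interface, remote IP,      vpi, vci
        ("atm0",           "192.0.2.11",   0,   100),
        ("atm0",           "192.0.2.12",   0,   101),
    ]

    for ifname, remote_ip, vpi, vci in pvc_plan:
        # A real boot script would call the driver's PVC utility here,
        # once to map outgoing cells for remote_ip onto VPI/VCI and once
        # to listen for incoming cells on that same VPI/VCI.
        print(f"{ifname}: {remote_ip} <-> VPI {vpi} / VCI {vci}")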
Each switch that carries PVCs needs to have each port/VPI/VCI
table
entry done by hand as well, although scripts can make
it easier.
Switches can store their tables so that they need not
be rebuilt every
time the switch is restarted. Because of the long setup
time and the need for human intervention, PVCs are usually built and kept
"up" for long
periods of time.
PVCs are analogous to leased telephone lines; that is, they are
permanently
installed between two specific endpoints.
Switched Virtual Circuits
A switched virtual circuit (SVC) is created automatically,
on demand.
SVC use requires a signaling protocol that allows one
network endpoint
to initiate a "call" to another network endpoint.
The switches
communicate with the endpoints and each other to create
a virtual
circuit across the network. SVCs stay "up"
only as long as necessary.
To use SVCs, an IP network administrator must install
and configure
SVC-capable software on the workstations and switches.
Fortunately, most
workstation ATM drivers and ATM switches that are suitable
for LAN
applications (and many that are suitable for WAN applications)
come with
some kind of SVC capability.
SVCs are analogous to dial-up telephone lines; however,
one workstation
can have many SVCs active at once.
MAGIC
The MAGIC network consists of five primary sites: the
TIOC at Sprint's
facility in Overland Park, KS; MSCI in Minneapolis,
MN; KU in Lawrence,
KS; EDC in Sioux Falls, SD; and the Army's BCBL at Fort
Leavenworth, KS.
At the physical layer (see Figure 2), the MAGIC network
uses a hub
topology: the TIOC (the hub) has a SONET circuit to
each of the other
sites, with a minimum capacity of two operating OC3s
to any other site.
The two OC3s are multiplexed inside the OC48s that connect
the outlying
sites to the hub. Planning is underway to replace the
two OC3s with a
single OC12 to each site.
Four of the five MAGIC sites use ATM switches made by
FORE Systems, Inc.
These switches come equipped with SPANS, an SVC signaling
scheme. Most
of the workstation interfaces that connect to FORE switches
are also
FORE products, and support SPANS. As a result of the
widespread
availability of SPANS, most of MAGIC was SVC-based from
the beginning.
The fifth site uses Digital Equipment Corporation (DEC)
ATM switches and
DEC network interfaces on DEC workstations. Since the
DEC and FORE
equipment did not support a common signaling protocol,
all of the
cross-vendor virtual circuits were necessarily PVCs.
Since the MAGIC
network connected only five sites, with a small number
of workstations
at each site (typically 1-5), the PVCs were manageable,
although the
management was sometimes tedious.
As of this writing, MAGIC is preparing to implement
UNI 3.0, a standard
SVC signaling scheme that is supported on all the switches
and
workstation interfaces used in MAGIC. Completion of
this implementation
should greatly ease the PVC burden.
AAI
The Advanced Communications Technology Satellite (ACTS)
ATM Internetwork
(AAI) is another experimental and research network that
includes three
MAGIC sites and several other sites (12 sites as of
this writing)
scattered around the country. AAI's backbone (between
sites) is on
Sprint's commercial ATM service, which is often viewed
as a "cloud" with
every site connecting to the cloud. The cloud (actually
a collection of
high-capacity ATM switches that are fully meshed so
that any switch can
reach any other switch in one ATM hop) can be configured
by Sprint's
Broadband Operations Center (BBOC) to allow any site
to reach one, some,
or all other sites. A typical site on AAI has at least
one ATM switch
that connects to the cloud, to local workstations, and/or
to other local
switches.
When first brought online, AAI used a full mesh of PVCs
(i.e., every
workstation with an ATM interface on AAI had one PVC
to every other
workstation on AAI). Some sites have as few as two workstations,
but
other sites have several. This resulted in several pages
of PVCs -- one
PVC pair between every two hosts on AAI. Every time
new hosts came
online, a request was sent to Sprint's BBOC for a new
mesh from the new
workstation to all existing workstations, a procedure
that quickly
became unmanageable.
The answer was to create a full mesh -- on a site-by-site
basis -- of
virtual paths (VPs). Sprint's BBOC configured each site
to have one VP
to each of the other sites. Once the VPs were up, any
VCs (PVCs or SVCs)
created on the correct VP would automatically pass through
the cloud to
the destination site without needing further intervention
from the BBOC.
This revised process was a big improvement, especially
when new hosts
came online.
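The reason one VP per pair of sites is enough is that the cloud's
switches can forward on the VPI alone and carry the VCI through
untouched, so a VC created inside that VP rides it end to end. A
minimal sketch of the difference (Python, invented table entry):

    # VP switching: the cloud looks up only (in_port, vpi) and leaves
    # the VCI alone, so new VCs need no per-VC provisioning in the cloud.
    vp_table = {
        (1, 5): (9, 12),   # VP 5 arriving on port 1 leaves port 9 as VP 12
    }

    def forward_vp(in_port, vpi, vci):
        entry = vp_table.get((in_port, vpi))
        if entry is None:
            return None
        out_port, new_vpi = entry
        return (out_port, new_vpi, vci)   # VCI preserved across the cloud

    print(forward_vp(1, 5, 207))   # -> (9, 12, 207)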
The eventual implementations of UNI 3.0 were a boon
to AAI. UNI 3.0
provides SVC signaling across the cloud's VP mesh. Once
fully
transitioned, the only PVCs that are needed on AAI involve
the few,
unusual workstations that do not provide UNI 3.0 signaling.
The vast
majority of hosts on AAI do support UNI 3.0 signaling.
The MAGIC IP Network
At the Internet protocol (IP) layer, the MAGIC network
is flat. That is,
all of the MAGIC hosts (workstations and switches) are
on the ATM
network and are within the same class C IP network,
so they can reach
each other directly without going through any routers.
One central
coordinator allocates IP numbers as needed, which works
fine with the
small number of hosts on the network.
For communications to take place, the same VP must be
used on the two
ends of every link (either between two switches, between
a switch and
workstation, or even between two workstations that are
directly
connected). VP 0 is configured on most switches by default,
so MAGIC
uses that virtual path for all of its links. Also, VP
0 must be used
between each workstation and the switch to which it
is connected,
because the workstation interfaces only support VP 0.
Nonzero VPs could
be used on the links between switches, but in the MAGIC
network, there's
no advantage to it.
The size of the MAGIC network allowed for PVCs (where
needed) to use the
same VP/VC numbers for the entire length of each circuit
(which rarely
exceeded three switches). That scheme worked fine initially,
but network
growth (i.e., VC crowding) is making it more and more
difficult to
maintain.
The AAI IP Network
Initially AAI was brought online as one class C-sized
subnet of a class
B IP network. This allowed AAI to quickly establish
connectivity using a
borrowed subnet from one of the participants.
Once underway, the single IP network topology became
constraining
because of AAI's growth and the rapid change of some
sites'
configurations. This problem was resolved by requesting
a block of 16
contiguous class C networks, one for each site.
Using the capabilities of classless inter-domain
routing (CIDR; see
RFC 1519), AAI planned to "supernet" (as opposed
to "subnet") the
contiguous class C networks into one larger network
(with a 20-bit
netmask) in which all hosts could reach each other directly.
The AAI sites
quickly discovered that CIDR was not well supported, so they
reverted to keeping
their class C nets and creating static routes to each
of the other class
C networks (with a metric of 0). This resulted in 11
static routes on
every machine; a hack to be sure, but it works without
routers.
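For the curious, both arrangements are easy to see with Python's
ipaddress module (a modern convenience, not something in use on
AAI); the addresses below are placeholders, not the real AAI
allocations:

    import ipaddress

    # Sixteen contiguous class C (/24) networks, one per site plus spares.
    site_nets = [ipaddress.ip_network(f"10.20.{i}.0/24") for i in range(16)]

    # The planned CIDR "supernet": the sixteen /24s collapse into one /20.
    print(list(ipaddress.collapse_addresses(site_nets)))
    # -> [IPv4Network('10.20.0.0/20')]

    # The fallback actually used: from any one site, a static route to
    # each of the other sites' class C networks (11 routes with 12 sites).
    my_net = site_nets[0]
    print(len([n for n in site_nets[:12] if n != my_net]), "static routes")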
Multiple IP Networks on One ATM Network
There is overlap between MAGIC and AAI. There are three
sites whose
switches support both IP networks. One site (the TIOC)
is physically
attached to both nets, and provides the crossover for
the other two
sites. Aside from dealing with a larger number of VCs
and occasional
conflicting requirements between the two networks, having
both networks
run on a single switch works fine. In fact, several
other IP networks
are running on various switches on MAGIC and AAI. For
example, the MAGIC
hub switch and MAGIC/AAI crossover switch both support
multiple IP
networks as well as providing part of a link between
two ATM-capable IP
routers.
PVC Management
If the target hardware and networks support SVCs, that
scheme will
generally provide the easiest management of the ATM
network; however,
PVCs are sometimes necessary. When MAGIC was first brought
online, PVCs
were only required on the one non-FORE switch and its
attached hosts. At
that time the list of PVCs was centrally coordinated
and any one PVC
used the same VCI and VPI numbers (all VP 0, in fact)
for the entire
circuit. VCI numbers were kept unique throughout the
network.
As AAI evolved with its initially heavy use of PVCs
and multiple
administrative regions, central coordination and network-wide
uniqueness
of PVCs became impractical. Where necessary, PVC assignment
has been
delegated to individual sites.
A variety of PVC management schemes has been tried,
but the end result
for now is the brute force method: keep a file of what's
being used
where, and keep that file up-to-date. When the need
arises for a new
PVC, a common VPI/VCI is agreed upon by the site administrators
at each
end, and that VPI/VCI is used between the sites' switches.
Once the PVC
enters a site, that site's administrator decides how
to map that PVC to
the workstation. My technique is to keep a list of PVCs
between sites
(that often use nonzero VPs) and PVCs from the local
switches to the
local workstations, all on VP 0. The remote sites neither
know nor care
what VPIs and VCIs are used on my end of each PVC.
As a convenience to the other site administrators in
the MAGIC and AAI
networks, I keep my PVC list on a password-protected
web site so that
they can confirm assignments as necessary.
Troubleshooting Techniques
Several things can go wrong with these networks. When
a host for which I
am responsible cannot reach another host, I have learned
to check a
number of possibilities. There are some IP layer tools
that help, but
here I will concentrate on the ATM issues.
Check Carrier Indicators
Most ATM equipment will have something to indicate whether
or not a
carrier is present on a physical circuit. If the carrier
is not there,
then nothing is going to work on that circuit. Check
all the relevant
cabling to make sure nothing is unplugged or damaged.
If your ATM
circuit to the wide area is showing a loss of carrier,
chances are that
your provider is already aware of it, but call them
anyway to report a
problem.
On the FORE Systems equipment, carrier status can be
checked by
observing the LEDs on the switches and workstation adapters
(off is
good), or using utility programs on the switches and
workstations (these
make it easy to check things remotely).
Check PVC Mappings
If one or more workstations cannot talk over a given
circuit, but others
can, then the physical circuit is probably fine. If
you are using PVCs,
check all the switches to make sure the mappings are
still there and
correct. Even when using SVCs I have often found it
useful to build a
test PVC between two workstations to debug a circuit.
Because it is a
PVC, I know what the VPI and VCI numbers are, and they
will stay
configured until I delete them (unlike SVCs that can
come and go
automatically).
Check Cell Counters
Once the PVC mappings on the switches and end workstations
are confirmed
to be intact, start a continuous ping on one of the
workstations. Using
the utility programs supplied with the workstation ATM
adapters, check
to make sure that the outgoing cell count on the pinging
workstation
increases each time you check it. If not, check that
your workstation's
ATM interface and drivers are properly installed, and
are properly
configured at the ATM and IP layers.
Once your workstation is generating cells, go to each
switch in the
circuit, starting at the switch nearest the pinging
workstation, and
view the cell counters on the PVC that you are using.
As soon as you
find cell counters that are not rising, you have isolated
where the
problem is.
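That walk can be scripted. The outline below is only a sketch in
Python: read_cell_counter is a hypothetical stand-in for however
your switches expose their counters (a vendor utility run remotely,
an SNMP query, and so on).

    import time

    def read_cell_counter(switch, port, vpi, vci):
        """Hypothetical hook: return the current cell count for a VC.

        Replace the body with whatever your switches actually provide.
        """
        raise NotImplementedError

    def counter_is_rising(switch, port, vpi, vci, wait=5):
        """Sample a counter twice and report whether cells are flowing."""
        before = read_cell_counter(switch, port, vpi, vci)
        time.sleep(wait)
        after = read_cell_counter(switch, port, vpi, vci)
        return after > before

    # Walk the switches in order away from the pinging workstation; the
    # first hop whose counter is not rising is where the circuit breaks.
    # for hop in [("switch-a", 7, 0, 100), ("switch-b", 4, 0, 100)]:
    #     if not counter_is_rising(*hop):
    #         print("cells stop at", hop)
    #         break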
If the cell counters from the pinging workstation through
the switches
all look good, then check the pinged workstation for
the incoming cells
of the ping request. Then check the outgoing cell counters
to make sure
the workstation is generating a ping reply. Finally,
check the switches
on the return path from the pinged to the pinging workstations.
Gotchas
There are a couple of things that I have learned (the
hard way) about
DS3, SONET, and other circuits. The first is timing.
The timing
attribute applies to the ports on switches, and concerns
whether the
switch generates the clocking that the port uses on
its circuit, or
whether it uses a clock from the device at the other
end of the circuit.
Normally a circuit that comes from a provider (the phone
company) will
provide the timing, so set that port on your switch
to "external" or
"network" timing. Other settings of timing
will depend on specific
installations.
The second issue that has caused some head-scratching
is scrambling.
Scrambling in this case is done for timing reasons,
not for security. I
will not go into the details of scrambling, except to
say that both ends
of a circuit should have their switch ports set the
same -- either on or
off. Most DS3 and SONET circuits in MAGIC and AAI have
scrambling set to
"on."
Different scrambling settings on either end of a circuit
can cause cell
counter checks to look okay, but prevent the receiving
workstation from
properly decoding cells.
Bandwidth
An appetite for network bandwidth is often a reason
for going to ATM
networking, and ATM can provide it. However, it is important
to realize
that all parts of the network need to be up to the job,
including the
workstations and their operating systems.
MAGIC and AAI participants have demonstrated repeatedly
that while the
network can support high data rates, the workstations
often have trouble
generating traffic near the capacity of the circuits.
The classic
problem is one of relatively small TCP windows. With
fast ATM circuits
(such as OC3), the product of data rate and round-trip latency (the
amount of data "in flight" at any moment) goes up. Consequently,
a protocol that requires a receiving workstation to
acknowledge (ACK)
data (such as TCP) can cause the sending workstation
to spend a
disproportionately high amount of time waiting for ACKs
rather than
actually sending data. Fortunately, several workstation
operating
systems are offering "large window" options
for their TCP
implementations, allowing for larger amounts of data
to be sent before
waiting for an ACK.
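A quick calculation shows why the window size matters. Assuming,
for illustration, roughly 135 Mbps of usable payload on an OC3
after SONET, ATM, and AAL5 overhead, and a 30 ms cross-country
round-trip time:

    # Hypothetical figures; substitute your own circuit rate and RTT.
    link_bps = 135e6                 # usable OC3 payload, bits per second
    rtt = 0.030                      # round-trip time, seconds
    window = 64 * 1024 * 8           # a classic 64 KB TCP window, in bits

    bdp = link_bps * rtt             # bits "in flight" needed to fill the pipe
    max_throughput = window / rtt    # best case with that window

    print(f"bandwidth-delay product: {bdp / 8 / 1024:.0f} KB")
    print(f"64 KB window tops out near {max_throughput / 1e6:.1f} Mbps")
    # -> about 494 KB must be in flight to fill the pipe, but a 64 KB
    #    window allows only ~17.5 Mbps, which is why the RFC 1323
    #    "large window" options matter on circuits like these.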
My experiences with MAGIC and AAI prove that ATM networks
work, and work
well for IP, but are not without their own new set of
issues.
About the Author
David Rush is a software engineer for SRI International,
working under
contract at Sprint's TIOC. He can be reached at rush@erg.sri.com.