
Checking Network Health

Gilbert Held

Network administration is an interesting vocation. Just when users appear to be pleased with the operational status of the network, and the icons on the corporate network management console are all a lovely shade of green, unexpected problems typically occur. As icons turn from green to yellow or anxiety-causing red, buzzers go off, the telephone rings, and it becomes obvious that one or more network-related problems have occurred. Fortunately, most operating systems, including the various flavors of UNIX and many versions of Windows, contain built-in TCP/IP applications that can provide significant insight into the health of a network. This article examines several basic network-diagnosis tools that may be unfamiliar to the novice network or system administrator. Old administrative hands may find the review useful as well.

Is It Up?

One of the most common network-related tasks is determining whether a destination location is operational. Sometimes users will call the network administrator or operations center to report an inability to access a particular destination address. In the world of Web browsing, the user will typically report the display of a dialog box similar to the one shown in Figure 1, if they are using Netscape and are attempting to access the Web server in the domain lesmiserable.com. This dialog box can appear because the user entered an incorrect address, because the server is down for maintenance, or because traffic on the Internet caused address resolution to time out.

Although the typical solution to the network problem shown in Figure 1 is to check the Uniform Resource Locator (URL) name and try again, suppose that action fails. One tool you can use to check the status of a destination host is the ping application bundled with TCP/IP software.

ping

ping, a name often expanded as Packet InterNet Groper, sends a series of Internet Control Message Protocol (ICMP) Echo Request messages to determine if a remote device is active. When pinged, the remote device responds with an Echo Reply, enabling the round-trip delay between the originating and destination devices to be determined. Thus, the use of ping can normally answer two important questions - is the destination up, and if so, how long does it take to get there? The answer to the first question can eliminate subsequent testing if no response is received. The answer to the second can shed light on why Voice over IP or another delay-sensitive application does not appear to be working correctly. Although the implementation of ping can vary between operating systems, most support a core set of options. A common form of the ping command indicating some popular options is shown below:

ping [-q] [-v] [-t] [-n count] [-s size] Host

where:

-q Quiet output with no display except summary lines at startup and completion.

-v Verbose output lists ICMP packets received in addition to Echo responses.

-t Pings the specified host continuously until interrupted.

-n count Number of Echo Requests to send.

-s size Specifies the number of data bytes to be transmitted.

Host The IP address or host name of the destination system.

A few items concerning the above ping options warrant discussion. First, the -t option results in a continuous sequence of pings being transmitted to the destination system. Figure 2 illustrates the use of the -t option to ping the White House Web server. Note that the startup line returns the IP address and indicates the default use of 32 data bytes transmitted during each ping operation. Because the use of the -t option represents an elementary denial-of-service attack method, many organizations program their router access lists to deny the flow of ICMP Echo messages through the router. In such situations, ping requests will time out, and the lack of a response does not necessarily indicate that the destination is not operational.

When issued behind a router or firewall, ping is a useful mechanism to check the status of the TCP/IP protocol stack, adapter, and cabling to the network. When adding a new networking device to a hub, you can easily verify its accessibility by pinging it from another station on the network. If you do not receive a response from the newly installed device but can ping other devices on the network, this will indicate it is time to double-check the installation. Is the protocol stack up and running? Did you bind the stack to the correct network adapter? Is the cable connected to the adapter and hub port?
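
The check described above is easy to script. The following Python fragment is a minimal sketch rather than a definitive tool: the function names build_ping_command and is_reachable are hypothetical, and the sketch assumes that the fixed-count flag is -n on Windows and -c on most UNIX versions of ping. Exit-status conventions also vary between implementations, so a zero status should be read as a hint, not as proof that the device answered.

```python
import platform
import subprocess


def build_ping_command(host, count=4, system=None):
    """Return the argument list for a fixed-count ping (hypothetical helper)."""
    system = system or platform.system()
    # Assumption: Windows ping takes the count with -n, most UNIX versions with -c.
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, str(count), host]


def is_reachable(host, count=4):
    """Run the system ping and report whether it exited with status 0."""
    result = subprocess.run(
        build_ping_command(host, count),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Calling is_reachable with the address of the newly installed device and then with the address of a station known to be working gives the same comparison described above: if only the new device fails, the installation is the likely suspect.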

Returning to the ping options listed in the command format, the -n option enables you to transmit a specified number of Echo Requests before concluding the application. Some implementations of ping use a default count of 3 or 4 Echo Requests, while other implementations run continuously until interrupted via the use of a CTRL-C from the console. The -s option permits you to specify the number of data bytes to be transmitted during each Echo Request. Note that the ICMP header adds 8 bytes, resulting in an ICMP packet size equal to the data byte count plus 8.
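
The size arithmetic can be illustrated by assembling an Echo Request by hand. The Python sketch below follows the ICMP layout in RFC 792 (type, code, checksum, identifier, sequence number, then data); the function names and the identifier value are illustrative assumptions, and the packet is only built, not transmitted, since sending it would require a raw socket and administrative privileges.

```python
import struct

ICMP_ECHO_REQUEST = 8  # ICMP type for Echo Request; an Echo Reply is type 0


def internet_checksum(data):
    """One's-complement sum of 16-bit words, as used by ICMP."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:  # fold any carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier, sequence, payload):
    """Return an 8-byte ICMP header followed by the data bytes."""
    # The checksum is computed with the checksum field set to zero.
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, identifier, sequence)
    return header + payload


packet = build_echo_request(identifier=0x1234, sequence=1, payload=bytes(32))
print(len(packet))  # 40: 32 data bytes plus the 8-byte ICMP header
```

The IP header that carries the ICMP packet adds further bytes, which is why the frame seen on the wire is larger still.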

Another difference between implementations of ping concerns the presence or absence of summary statistics. Although many implementations of ping provide a summary of Echo Request responses, some implementations lack this feature, requiring the user to put pen to paper. A typical summary indicates the number of packets transmitted, packets received, percent packet loss, and the minimum, average, and maximum round-trip delay. Note that the round-trip delay time can be a bit deceptive if you enter a host name instead of an IP address. If the host name to IP address resolution was not previously performed and stored in cache on your local DNS, the first ping will require a bit more time since the IP address must be determined. Thus, if you need to determine latency through the use of ping, it is advisable to issue the command twice and use the second set of results.
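
For implementations that omit the summary, the arithmetic is simple enough to script instead of putting pen to paper. The Python sketch below is illustrative only: the function name summarize and the sample delays are hypothetical, and a lost reply is represented as None.

```python
def summarize(rtts_ms):
    """Summarize round-trip delays; one entry per Echo Request, None if lost."""
    sent = len(rtts_ms)
    replies = [r for r in rtts_ms if r is not None]
    received = len(replies)
    loss_pct = 100.0 * (sent - received) / sent if sent else 0.0
    summary = {"sent": sent, "received": received, "loss_pct": loss_pct}
    if replies:  # min/avg/max are undefined when nothing came back
        summary["min"] = min(replies)
        summary["avg"] = sum(replies) / received
        summary["max"] = max(replies)
    return summary


# Hypothetical delays in milliseconds; the first is assumed to be inflated.
samples = [180.0, 42.0, None, 40.0, 44.0]
print(summarize(samples))      # all five requests
print(summarize(samples[1:]))  # first sample discarded
```

Comparing the two summaries shows how much a single slow first reply can distort the average, which is the reason for relying on the second set of results.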

From Here to There

When ping times out without a response, it is quite possible that the destination is operational but that a problem exists on the route to the destination. When this situation arises, it is probably time to consider the use of another popular network utility application - traceroute.

traceroute, invoked as tracert on Windows systems, provides information about the route that packets take from the local host issuing the command to the destination named in the command. The general format of the traceroute command is as follows:

traceroute [-w timeout] [-q packets] [-h max-hops] host

where:

-w Represents the amount of time, in seconds or milliseconds depending on the implementation, to wait for an answer from a router.

-q Represents the number of UDP packets transmitted with each time-to-live setting.

-h Represents the maximum number of hops to search for the destination host.

host Represents the IP address or host name of the destination.

traceroute typically operates by transmitting a sequence of three UDP datagrams to the destination using an invalid destination port address and a Time-To-Live (TTL) field value of 1. The TTL value of 1 results in the datagram expiring as soon as it reaches the first router in the path to the destination. The router will respond with an ICMP Time Exceeded Message (TEM) indicating that the datagram expired. The typical default use of three datagrams provides three round-trip delays to each router on the path to the destination. In response to the preceding TEM, the application transmits another sequence of three UDP datagrams using a TTL value of 2. This results in the second router in the path to the destination returning another ICMP TEM. This process continues until either the destination is reached or the default or set maximum number of hops permitted by the program is reached. Once the destination host is reached, the use of an invalid port address results in the return of an ICMP Destination Unreachable Message. The return of this message informs traceroute to terminate its operation.
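
The TTL-stepping logic just described can be sketched without touching the network. In the Python fragment below, everything is an assumption made for illustration: the path is a hypothetical list of documentation addresses, make_simulated_network stands in for the real exchange of UDP probes and ICMP replies, and timeouts and filtered replies are not modeled.

```python
TIME_EXCEEDED = "time-exceeded"              # reply from an intermediate router
PORT_UNREACHABLE = "destination-unreachable"  # reply from the destination host


def make_simulated_network(path):
    """Return a probe function for a hypothetical path; the last entry is the destination."""
    def send_probe(ttl):
        if ttl < len(path):
            # The datagram expires at router number ttl, which reports it.
            return path[ttl - 1], TIME_EXCEEDED
        # The datagram reaches the destination, where the port is invalid.
        return path[-1], PORT_UNREACHABLE
    return send_probe


def trace(send_probe, max_hops=30, probes_per_hop=3):
    """Raise the TTL one hop at a time until the destination answers."""
    hops = []
    for ttl in range(1, max_hops + 1):
        replies = [send_probe(ttl) for _ in range(probes_per_hop)]
        hops.append((ttl, replies[0][0]))
        if any(kind == PORT_UNREACHABLE for _, kind in replies):
            break  # destination reached; stop probing
    return hops


route = ["192.0.2.1", "198.51.100.7", "203.0.113.9", "203.0.113.80"]
for ttl, address in trace(make_simulated_network(route)):
    print(ttl, address)
```

A real implementation would also time each probe to produce the three round-trip delays per hop and would print an asterisk or similar marker when a router fails to answer within the timeout.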

As with ping, some organizations now filter the ICMP messages on which traceroute depends. However, even when packet filtering precludes reaching a host, you can still use traceroute to check the status of the path as far as the destination's router and obtain valuable information about the route to the destination.

Figure 3 illustrates the use of tracert to trace the route to the Yale University Web server. By examining the response to traceroute, you can obtain a significant amount of information concerning the structure of other networks linking your organization on the path to the destination host. For example, the first hop takes your path from your organization's router (192.131.175.2) to a BBN Planet router in Atlanta whose interface has the IP address 4.0.156.85. From Atlanta, another hop within the BBN Planet network remains in that city (IP address 4.0.3.253) but provides the route to a router located in Vienna, Virginia. By examining the routers in the path, you will note the Vienna BBN facility is connected to another BBN facility in New York City, where packets destined for Yale are apparently routed over an ATM connection to Boston and then to the university. Once the packet reaches the router at Yale's connection to the Internet, it requires three additional hops to reach that university's Web server. Note that from the last line in the traceroute response, the host's real name is not www.yale.edu, but elsinore.cis.yale.edu.

If you have a corporate internal IP network, you should know the routes from source to destination. This makes traceroute ideal for determining the operating status of various components across your network. Even when you are at the mercy of the Internet, traceroute can help you immediately determine the status of your organization's connection to the Internet. If your organization has a leased line linking your LAN to the Internet, you can use either ping or traceroute to determine the status of the distant end of the connection and the transmission facility connecting you to your ISP. If you previously noted the structure of your ISP's infrastructure, you can use traceroute to verify the operational status of your ISP. In some situations, you can use this knowledge to separate the wheat from the chaff if your ISP claims the problem is at a "higher level" on the Internet but traceroute indicates otherwise.

Another popular use of traceroute is to determine the practicality of virtual networking via the Internet. If your organization already has several sites connected to the Internet, it is better to first investigate the route and potential delays between sites than to purchase VPN equipment only to find that the routing infrastructure and resulting delays make the interaction between sites resemble the Long Island Expressway on a Friday afternoon.

nslookup

A third TCP/IP network-related application that can be extremely useful is nslookup, where ns represents name server. This application provides the ability to display information from Domain Name System (DNS) name servers. The general format of the nslookup command is as follows:

nslookup [-option...] [host-to-locate | - [server]]

Most implementations of nslookup support two modes of operation - interactive and non-interactive. Entering the command without any parameters, or following the command with a hyphen (-) instead of a host name or address, typically results in the program operating in its interactive mode. When this occurs, the nslookup program will first return the name and address of the default name server configured in your TCP/IP protocol stack and then display an angle-bracket prompt (>) to indicate it is waiting for a command. In non-interactive mode, you would enter one or more options followed by the host name or IP address of the computer you wish to look up, optionally followed by the DNS name server to use. Omitting the server results in the use of the DNS address in your TCP/IP configuration as the default.

If you only need to look up a single piece of information, you would probably use the non-interactive command mode, while the interactive query mode is more suitable for repeated queries. Figure 4 illustrates the non-interactive use of nslookup to find information about the Yale University Web server. Note that because a server name or IP address was not specified, the default DNS, whose address is 192.131.174.1, was used.

The power and utility of nslookup result from the large number of options it supports. Because it is designed to retrieve information about DNS records, it can be used to obtain information about mailbox domain names, the target domain's start-of-authority record, the address of mail exchangers, and a variety of user information. Because the results of nslookup can provide unscrupulous persons with information they more than likely shouldn't obtain, many sites refuse nslookup queries. This situation is illustrated at the top of Figure 5, which indicates the use of the nslookup interactive mode. After commencing the command with its name followed by a hyphen, the application returns the name and IP address of the default DNS that will be used (in this example, serv1.abc.edu, whose IP address is 192.131.174.1). Next, an example of the use of the nslookup set query type (q) command follows. Through the use of the set q= command, you can retrieve information about different DNS record types, ranging from a domain's start-of-authority (SOA) record to mail exchanger (MX) and mail list (MINFO) information. In this example, the set q=mb command was used in an attempt to retrieve mailbox records. Next, the command ls was entered to list addresses that correspond to the query type.

For persons not familiar with nslookup, it should be noted that some implementations of the application permit you to follow the ls command with the type or types of records of interest, such as A, CNAME, MX, etc. For specific information on record types, consult RFC 1035. Returning to the example in Figure 5, note that Yale's name server is configured to reject nslookup queries. However, I was able to access records on the name server of a government agency. To protect the privacy of the agency, the organization's domain is shown as spy.gov. Note that the ls command returned both the name server for the government agency and its ISP as well as the IP addresses and host names for each device in the domain that has an A record entry in the name server. Note that an A record is simply a host address record. Also note that you would use the exit command to gracefully leave the nslookup interactive mode of operation.
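
To make the record types concrete, the Python sketch below builds the kind of query message that a set q= selection ultimately controls. It follows the message layout and type codes in RFC 1035, but it is an illustration under stated assumptions, not a description of how any particular nslookup is implemented: the function names and the query ID are hypothetical, and the message is only assembled, not sent to a name server.

```python
import struct

# Numeric query types from RFC 1035 for records mentioned in this article.
QTYPES = {"A": 1, "NS": 2, "CNAME": 5, "SOA": 6, "MB": 7, "MINFO": 14, "MX": 15}
QCLASS_IN = 1  # the Internet class


def encode_name(name):
    """Encode a host name as length-prefixed labels ending in a zero byte."""
    encoded = b""
    for label in name.strip(".").split("."):
        raw = label.encode("ascii")
        encoded += bytes([len(raw)]) + raw
    return encoded + b"\x00"


def build_query(name, qtype, query_id=1):
    """Return a DNS query message asking one question."""
    flags = 0x0100  # standard query with the recursion-desired bit set
    # Header: ID, flags, then question/answer/authority/additional counts.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", QTYPES[qtype], QCLASS_IN)
    return header + question


query = build_query("www.yale.edu", "MX")
print(len(query))  # 30: 12-byte header, 14-byte name, 4 bytes of type and class
```

Changing the second argument from "MX" to "SOA" or "MB" alters only the two type bytes near the end of the message, which is all that distinguishes one record-type request from another.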

One of the uses of nslookup is to review the records on a distant name server. By carefully examining information about certain records, you can typically isolate such problems as lost or non-delivered email as well as reasons why IP addresses cannot be resolved. Because ping, traceroute, and nslookup are built into most operating systems, they provide you with a trove of readily available network management tools. By learning when and how to use them, you may be able to isolate and resolve a variety of problems without needing more sophisticated network management programs.

About the Author

Gilbert Held is an award-winning author and lecturer. Gil is the author of more than 40 books and 250 technical articles. Some of his recent books include Using Network Based Images, Understanding Data Communications 2ed., Ethernet Networks 3ed., Virtual LANs and Data and Image Compression 4ed., all published by John Wiley & Sons. Gil can be reached at: 235-8068@mcimail.com.