Practical Packet Sniffing
The packet sniffer is an extremely useful, and often overlooked, systems administration tool that helps resolve not only complex network issues, but also application-level problems. Every systems administrator, whether he or she is responsible for networking or not, should be familiar with the workings of packet sniffers and should have one in his or her troubleshooting toolbox.
This article will provide a basic overview of packet sniffers and how to use them, while also providing several practical examples. I will then discuss a series of simple case studies, detailing real-life problems that I have encountered as a systems administrator, and how packet sniffers were used to isolate the root causes of those issues. The goal is that, by the end of this article, you will possess skills and tools that can be applied right away.
My focus will be on Ethernet networks -- almost exclusively on the TCP/IP protocol suite -- and all of my examples will be for tcpdump and snoop (q.v.). I chose tcpdump because it is freely available, and snoop because it is part of the Solaris OS and has several useful features that I am familiar with.
One word of caution: packet sniffers can reveal sensitive information, particularly when you are viewing the actual application data. It is easy to read passwords, private email, and even NFS file operations with only rudimentary knowledge. For this reason, it is extremely important that you:
1. Never use a packet sniffer while a user (i.e., non-systems administrator) is watching.
2. Only view as much data as necessary to solve the problem at hand.
3. Follow your local system policies regarding directed network monitoring before proceeding.
A Brief Introduction to Packet Sniffers
Packet sniffers like tcpdump and snoop are designed to capture and analyze individual network packets. Normally, a specific machine only sees those packets that are destined for it, and a particular application only receives packets that are a part of its session. The packet sniffer, however, will by default place the specified network interface in promiscuous mode, which means that it will capture all packets traveling on the wire, regardless of source or destination. To perform this feat, the program must have the ability to open and read from the network device. On a UNIX system, this means it must be running as root -- a sensible restriction, because a packet sniffer can easily be used to gather passwords and other sensitive information, which is not a capability you want in the hands of users. Insecure operating systems, such as DOS and Win95/98, allow users to perform any action with no restrictions, making it possible for general users to run packet sniffers indiscriminately. Caveat, sysadmin.
However, you should be aware of some limitations. Most importantly, a software-based packet sniffer can be quite CPU intensive, particularly if a complex filter is applied against a busy network interface. Sniffing on critical machines (file servers, mail servers, etc.) can have a detrimental effect on their performance, and make you unpopular if done indiscriminately for extended lengths of time. Also, packet sniffers show you what is happening on the network only for a given point in time. Unlike more general network monitoring systems, the basic packet sniffer gives you no information regarding trending or overall utilization. Because of these shortcomings, the packet sniffer is most often employed when looking for a specific problem or trying to analyze behavior in a specific network session.
In this article, I will provide examples for two sniffers -- tcpdump and snoop. Both of these are introduced, briefly, below:
tcpdump is a freely available sniffer from the Lawrence Berkeley National Laboratory's Network Research Group. The source distribution can be obtained from ftp://ftp.ee.lbl.gov/ along with the libpcap library, described as a system-independent interface for user-level packet capture. You will need libpcap to compile tcpdump. Users of Linux and the free BSD variants for the PC may conveniently find tcpdump included as part of their distribution. Users of AIX may not be able to run the freely distributed version of tcpdump because of difficulties making the OS work with libpcap. AIX does, however, ship with a version of tcpdump based on an older release. Users of HP-UX may need to do some porting to get tcpdump and libpcap to compile, but users of 10.20 can obtain a patched source distribution at:
snoop is part of the Solaris 2.x distribution, and has some excellent features that make it a very capable packet sniffer. At the basic level, tcpdump and snoop are functionally equivalent; there are subtle differences, however, in their filter syntax and data output. I prefer snoop to tcpdump, and recommend its use on the Solaris OS.
Using Your Packet Sniffer
The best way to learn the ins and outs of your packet sniffer is to read the man page and practice with various options and expressions. In this discussion, I won't attempt to duplicate what is found in the man pages for snoop and tcpdump. Instead, I will cover what I believe to be their most useful options and provide a simple introduction to how filter expressions are built. The man pages for both of these tools are lengthy, and discuss the syntax and options in great detail. However, they must be read carefully -- the common practice of simply skimming for information can quickly get you lost.
Useful Options for tcpdump
-N -- Prints the short form of host names instead of FQDNs.
-i interface -- Specifies the network interface to listen on.
-l -- Line-buffers stdout. This is handy when you want to watch the data display while capturing it.
-p -- Specifies that the sniffer not put the interface in promiscuous mode. This is useful when you only want to monitor traffic to or from the host that you are logged into. This also prevents the sniffer from looking at other traffic on the wire.
-r file -- Reads packets from file (which was created with the -w option).
-s snaplen -- Captures snaplen bytes of data from each packet, rather than the default of 80. It's necessary when you need to see the actual application data within the packets.
-w file -- Writes the raw, captured packets to file, which can be read using the -r option.
-x -- Prints the contents of each packet, in hex, minus the link-level header. This is what actually lets you view the packet contents.
Useful Options for snoop
-P -- Do not put the interface into promiscuous mode -- like -p in tcpdump.
-d device -- Specifies the network device to listen to -- like -i in tcpdump.
-i file -- Reads packets from file (which was previously created using the -o option).
-o file -- Writes the raw, captured packets to file.
-s snaplen -- Truncates the packets after snaplen bytes. Note that this is different from tcpdump, in that snoop will read the whole packet if -s is omitted (tcpdump only reads 80 bytes by default).
-t [ a | d | r ] -- Displays either an absolute, delta, or relative timestamp for each packet.
-x offset [ , length ] -- Prints the contents of the packets in both hexadecimal and ASCII, starting at byte offset and, optionally, up to length bytes. If length is omitted, the rest of the packet is displayed.
Basic Filter Expressions
Though tcpdump and snoop have slightly different syntax, simple expressions can be built using a common approach. Each expression can be thought of as consisting of three primitives, all of which are optional:
program [ options ] [ protocol ] [ direction ] [ type ] < id >
In the above example, the program would be either tcpdump or snoop, and the options would be the desired command-line switches to the sniffer. The actual filter expression consists of the arguments for protocol, direction, and type. If the sniffer is run with no arguments, the default behavior is to report all packets captured on the given interface.
Protocol specifies the protocol of interest and limits the reporting of packets to those matching that protocol. Typical values for protocol might be: ip, arp, rarp, tcp, udp, etc.
Direction limits the reporting of packets to those headed to, from, or between a host or pair of hosts. Typical values for direction might be src, dst, src or dst, or src and dst, but the exact set of legal values depends upon your sniffer. Read the man page for more information.
Type identifies what ID is, and would typically be host, net, port, etc. Again, exact syntax depends upon the sniffer being used. Generally, if type is omitted, ID is assumed to be a host name or IP address.
Additionally, primitives can be combined with parentheses and boolean operators, such as and, or, and not. Each sniffer may have special keywords, such as multicast and broadcast, as well as arithmetic expressions, and advanced filters can even be built that examine specific bytes within the packet itself. Again, the man pages provide detailed information on the specific syntax for your sniffer.
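The protocol, direction, type, and ID pattern described above can be sketched in a few lines of Python. This is only an illustration: the helper names primitive and combine are hypothetical, and the strings they produce cover just the common subset of syntax discussed in this article, so check your sniffer's man page before relying on them.

```python
def primitive(ident, type_=None, direction=None, protocol=None):
    """Build one filter primitive in the order [protocol] [direction] [type] id.

    All qualifiers are optional; omitted ones are simply left out.
    (Hypothetical helper, not part of tcpdump or snoop.)
    """
    parts = [protocol, direction, type_, str(ident)]
    return " ".join(p for p in parts if p)


def combine(op, *primitives):
    """Join primitives with a boolean operator ('and' or 'or')."""
    if op not in ("and", "or"):
        raise ValueError("op must be 'and' or 'or'")
    return f" {op} ".join(primitives)


# All traffic to or from host gpws on port 23 (telnet).
expr = combine("and", primitive("gpws", type_="host"), primitive(23, type_="port"))
print(expr)  # host gpws and port 23
```

The resulting string is what you would pass after the options on the sniffer's command line.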
1. snoop host gpws and port 23
This monitors all traffic to and from host gpws on port 23, which has the effect of monitoring all telnet sessions on this machine. snoop is particularly good for this task, since it decodes the first few bytes of the packet for you, displaying them in ASCII without your specifying any options. This makes it relatively easy to monitor a given user's login session -- including when they enter their login password -- and makes a good argument for deploying ssh.
2. tcpdump -s 194 host nfssrv2 and port nfs
This shows all NFS traffic to and from nfssrv2 on the local subnet. Note that a snap length of 194 is required for the sniffer to completely decode NFS packets (the man page for tcpdump incorrectly states that 193 is the minimum size). This command will give you the NFS procedure, as well as target filehandles and filenames.
3. snoop -x 54 host www and port 80
This displays the contents of HTTP packets going to and from the server www. When using tcpdump, increase your snap length to see the entire contents of the packet.
4. tcpdump -x port smtp
This shows all mail (SMTP) traffic on the local subnet. If run on a mail hub, this will grab all mail traffic to and from the machine. This demonstrates how easy it is to read users' email, and makes a good argument for using PGP for sensitive messages.
Hints and Tips
Simply knowing how to use your packet sniffer may not be enough. A well-constructed filter, for example, can be worthless if it is run on the wrong machine, or if the captured data can't be examined at a later time. In this section, I will point out some of the pitfalls in packet sniffing and discuss how to work around them.
In a traditional Ethernet network built around shared hubs, every machine on the local network sees every packet on the subnet. Although this was a simple and cheap network design, it created a very large security hole -- any machine on the subnet could essentially monitor all the traffic on that subnet using a packet sniffer. In a switched environment, the hub is replaced with a switch that directs packets to the particular segment connected to the destination host. The end result is that, in a typical switch configuration, a machine only sees the traffic sent or received by itself. This is good for security, but makes the business of legitimate packet sniffing by administrators quite a bit more difficult.
If you are in a switched environment, and you need to monitor traffic to or from a specific host, then you should log in to that host and sniff locally. If, for some reason, login access to the client is not possible, you can either sniff from the server, or set up a dedicated host on that segment to monitor the desired traffic. As a general rule, sniffing from the client machine is your best option, because it does not require extra hardware and won't create performance issues for the server machine.
A machine with multiple network interfaces is sometimes referred to as being multi-homed, since it exists on several different subnets. When sniffing on a multi-homed machine, you must first determine which interface is sending or receiving the desired traffic, then direct your sniffer at that interface. Note, however, that routing is not always intuitive or logical. It is completely reasonable for traffic to enter via one interface and leave through another. This means that a single sniffer may capture only half of the conversation between two machines. In general, when dealing with traffic between a multi-homed and single-homed machine, it's best to sniff from the single-homed one. If this is not possible, you may find it necessary to run multiple sniffers simultaneously, which can impair performance.
Using -s Wisely
The more data you capture from each packet, the more likely it is that your sniffer will drop packets during your session. Capturing packets is a CPU-intensive task, and heavy network activity can easily overwhelm a packet sniffer (and a CPU), particularly if you are using the -x option to display the packet contents. The purpose of the -s switch is to limit the amount of data that your sniffer must decode. By default, snoop will attempt to capture the entire packet, whereas tcpdump will only look at 80 bytes -- neither of these defaults is optimal for all situations. You should adjust your snap length for a particular session. Usually, that means capturing only as much of the packet as is needed for your investigation or analysis, particularly when on a chatty net.
Capture to a File
The best way to make the most of your packet sniffer is to capture the raw packets to a file, then run filters on the packet file. Running a broad filter and capturing a wide range of packets gives you the ability to fine-tune your filter expression after the fact and extract only the data you need. You will also have a permanent copy of the packets from that time interval, so you can run several different filters against it if you decide you need more (or less) information from the session. You can also recreate the output from your sniffer at any time, and in any format, that you need. Most important, however, is the fact that writing the raw packets to a file avoids I/O buffering problems. When you are running a packet sniffer live, you may not see the output from the most recent traffic; this typically means that the last packet or two from a particular session may never get displayed to your screen because the final packets of a conversation may not fill the output buffer for your tty.
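To show why a capture file can be re-read as often as you like, here is a minimal Python sketch that walks the records in a file written by tcpdump -w. It assumes the classic libpcap file layout (a 24-byte file header followed by 16-byte record headers, microsecond timestamps); files written by snoop -o use a different format and are not handled. The function names are hypothetical.

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4  # classic libpcap file, microsecond timestamps


def read_pcap(data):
    """Yield (timestamp, captured_len, original_len, packet_bytes) per record."""
    if len(data) < 24:
        raise ValueError("too short for a pcap file header")
    # The magic number tells us which byte order the writing host used.
    if struct.unpack("<I", data[:4])[0] == PCAP_MAGIC:
        endian = "<"
    elif struct.unpack(">I", data[:4])[0] == PCAP_MAGIC:
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    offset = 24  # skip the rest of the file header
    while offset + 16 <= len(data):
        ts_sec, ts_usec, incl_len, orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16])
        offset += 16
        packet = data[offset:offset + incl_len]
        offset += incl_len
        yield ts_sec + ts_usec / 1e6, incl_len, orig_len, packet


def summarize(path):
    """Print one line per packet, flagging packets cut short by the snap length."""
    with open(path, "rb") as f:
        for ts, caplen, origlen, _ in read_pcap(f.read()):
            note = " (truncated by snap length)" if caplen < origlen else ""
            print(f"{ts:.6f} captured {caplen} of {origlen} bytes{note}")
```

Because each record stores both the captured length and the original length, a saved file also tells you after the fact whether your -s value was too small for the analysis you have in mind.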
Reading Application Data
The -x option allows you to view the packet data, which will show you what the particular application responsible for your packet is saying. With snoop, you must provide an offset to the -x option, which tells the sniffer where to start when displaying the packet contents. For a TCP/IP packet on an Ethernet network, the packet data starts at byte 54, so -x 54 will display the application data without showing the packet headers. snoop also provides you an ASCII representation of the packet data, which is handy for monitoring those protocols that speak English (SMTP, HTTP, POP, IMAP, etc.).
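The figure of 54 is the sum of a 14-byte Ethernet header, a 20-byte IP header, and a 20-byte TCP header. When IP or TCP options are present the headers grow, and the application data starts later. The following Python sketch reads the two header-length fields to find the real offset; it assumes Ethernet II framing without VLAN tags and IPv4, and the function name is hypothetical.

```python
ETHERNET_HEADER_LEN = 14  # destination MAC (6) + source MAC (6) + EtherType (2)


def tcp_payload_offset(frame):
    """Return the byte offset of the TCP payload in an Ethernet II / IPv4 frame."""
    if len(frame) < ETHERNET_HEADER_LEN + 20:
        raise ValueError("frame too short for an IPv4 header")
    if frame[12:14] != b"\x08\x00":
        raise ValueError("not an IPv4 frame")
    ip_start = ETHERNET_HEADER_LEN
    # Low nibble of the first IP byte is the header length in 32-bit words.
    ip_header_len = (frame[ip_start] & 0x0F) * 4
    if frame[ip_start + 9] != 6:
        raise ValueError("not a TCP segment")
    tcp_start = ip_start + ip_header_len
    if len(frame) < tcp_start + 20:
        raise ValueError("frame too short for a TCP header")
    # High nibble of TCP byte 12 is the data offset in 32-bit words.
    tcp_header_len = (frame[tcp_start + 12] >> 4) * 4
    return tcp_start + tcp_header_len
```

For a frame with no options this returns 54; a TCP segment carrying 12 bytes of options, for example, would put the data at byte 66 instead.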
When using tcpdump, the -x option does not take an offset, so you will need to wade through several bytes of headers before seeing the actual application data. And, unlike snoop, tcpdump will not give you an ASCII representation of the packet data, showing only the raw hex output. I have provided a quick-and-dirty Perl script, called tcpdascii, which takes output from tcpdump's -x option and provides a snoop-like ASCII representation of the data. tcpdascii is not a very robust or flexible program, but it does work. Note, however, that piping live tcpdump output to tcpdascii amplifies the I/O buffering problems mentioned above. So, capture your packets to a file first, and then run tcpdump with the -x option on the capture file and pipe the output to tcpdascii. (tcpdascii is available from Sys Admin at: http://www.sysadminmag.com or http://ftp.mfi.com.)
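For readers who want to see the idea without fetching the script, here is a rough Python sketch of the same technique. It is not the tcpdascii script itself. It assumes each hex line of tcpdump -x output consists of whitespace-separated groups of two or four hex digits, optionally preceded by an offset label such as 0x0010: (as some tcpdump releases print); any other line is passed through untouched. The function names are hypothetical.

```python
import re

HEX_GROUP = re.compile(r"^(?:[0-9a-fA-F]{2}){1,2}$")  # e.g. "4500" or "0a"
OFFSET_LABEL = re.compile(r"^0x[0-9a-fA-F]+:$")        # e.g. "0x0010:"


def hex_line_to_ascii(line):
    """Return the ASCII rendering of one hex line, or None if it is not hex data.

    Bytes outside the printable ASCII range are shown as '.'.
    """
    fields = line.split()
    if fields and OFFSET_LABEL.match(fields[0]):
        fields = fields[1:]
    if not fields or not all(HEX_GROUP.match(f) for f in fields):
        return None
    raw = bytes.fromhex("".join(fields))
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


def annotate(lines):
    """Yield each input line, with an ASCII column appended to hex lines."""
    for line in lines:
        line = line.rstrip("\n")
        text = hex_line_to_ascii(line)
        yield line if text is None else f"{line}  |{text}|"
```

To use it, read a saved capture with tcpdump -x -r, feed the output lines to annotate (for example from sys.stdin), and print each result.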
Hardware Implementation of Protocols
Some hardware vendors will manufacture task-specific servers that implement a particular protocol in hardware for increased performance. A common example would be high-end NFS file servers, where the NFS requests are processed in hardware running a dedicated kernel. In these cases, where a server is processing a particular protocol at the hardware level, your packet sniffer may not see the traffic because it's not passed up to the main kernel for processing. In these situations, you may have no choice but to sniff from the client machines or set up a dedicated sniffer on the subnets in question.
Once you are familiar with your packet sniffer and how to wield it effectively, you can put it to use solving everyday problems. Naturally, the uses of a packet sniffer are wide and varied and can't all be covered in the space of this article. Instead, I will discuss some real-world issues that I have encountered and describe how packet sniffers were used to resolve them. These examples provide only a glimpse of what is possible, but serve as good examples because of their simplicity.
Note that the actual packet sniffer output is provided for most of these case studies, but usernames and hostnames have been modified to protect the privacy of the individuals involved.
Case Study #1: NFS server fs11 not responding
In this case, the user filed a request indicating that the error message NFS server fs11 not responding was repeating sporadically in his console window. To determine some sort of pattern, or cause-effect relationship between actions on the workstation (Sparc Solaris 2.5.1) and the error message, I logged on to his machine, ran snoop to examine the NFS traffic, and then ran ls against various remote filesystems exported by that NFS server. The output from this session is shown in Figure 1.
I saw that NFS retransmits were being generated when stat-ing certain filesystems (see the READDIRPLUS3 procedure calls). A look at /etc/exports on the fs11 fileserver showed that the filesystems responsible for the errors all had timeo=1 set in the parameter list, which is way too small for clients that run NFS version 3. This timeo parameter was a legacy setting for older filesystems, and it had never been removed.
In this situation, the problem could have been identified without the use of a sniffer, but the sniffer immediately pointed me in the right direction.
Case Study #2: Mail server configuration problem
Our support group received a request from a user claiming that there was a mail server configuration problem with our mail hub. He said the problem was that a single invalid recipient prevented delivery of the message to any of the recipients (i.e., one bad address on the To: or Cc: line supposedly caused the entire message to bounce without delivering to anyone). I was skeptical, since this is not how the SMTP protocol is designed -- a single bad address will not prevent delivery to the entire list of recipients. I also knew from experience that our mail server was running properly.
Since there was no problem with the mail server, my attention turned to the user's mail client. In this case, the user was running a Web server in his department, and was using an NT application that he downloaded off the Internet to send his email from a Web form. Since I could not log into the NT machine or examine the mail program's source code, I opted to run a network sniffer on the mail server and monitor the conversation between the client and server.
Figure 2 shows an excerpt from the SMTP exchange. Because our mail server is multi-homed, only the server half of the conversation was captured. It was enough, however, to identify the source of the problem. Packet eight in the session shows the reply from the server to the client -- a properly formed 550 error in response to the invalid recipient specified by the client. However, packet nine shows that the mail server closed the connection with a 221, indicating that it received an explicit QUIT command from the client.
I concluded that the mail server was fine. The user's mail program was causing the problem. Instead of accepting the 550 (invalid recipient) error and moving on to the next recipient, the mail client incorrectly responded with a QUIT directive, aborting the SMTP session without sending the message for delivery.
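The behavior a well-behaved client should show can be sketched as a small decision function. This is a simplified model in Python, not the code of any real mail client: it looks only at the numeric reply to each RCPT TO command, treats 250 and 251 as acceptance, and uses made-up example.org addresses.

```python
def plan_after_rcpt(replies):
    """Decide the client's next SMTP command from the RCPT TO replies.

    replies maps each recipient address to the numeric reply code the
    server returned for that recipient's RCPT TO command.
    """
    accepted = [rcpt for rcpt, code in replies.items() if code in (250, 251)]
    rejected = [rcpt for rcpt in replies if rcpt not in accepted]
    # One rejected recipient is not fatal: send the message to the rest.
    next_command = "DATA" if accepted else "QUIT"
    return next_command, accepted, rejected


replies = {
    "alice@example.org": 250,
    "nosuchuser@example.org": 550,  # invalid recipient
    "bob@example.org": 250,
}
print(plan_after_rcpt(replies)[0])  # DATA
```

The client in this case study behaved as if any 550 reply meant QUIT, which is why no recipient ever received the message.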
Case Study #3: POP server is deleting my mail
In this problem, a user claimed that the POP server was deleting his email from the mail server, even though he had explicitly configured it not to do so. When he read his mail via POP, he could see all of his mail messages because they were downloaded to his local computer. Attempts to read his mail via UNIX mail clients (elm, pine, etc.) showed that his mail spool was empty.
To investigate this claim, I decided to run a sniffer on the POP server and watch the POP dialogue to see whether the client was explicitly telling the server to delete the mail in the mail spool. I discovered, however, that it wasn't necessary to go this far.
Figure 3 shows the relevant output from the very beginning of the POP session. Packet 19 shows the user login, and packet 21 shows the server response after the user's password was validated. In packet 21, the server is claiming that there are 983 messages in the user's mailbox. A look at /usr/spool/mail/<user> showed that the mail spool was, indeed, empty. So where was his mail?
The answer was in the documentation for the POP server, which states that the ~/mbox file will be used for storing incoming email, if it exists. A look at the user's home directory showed that he did, in fact, have an mbox file in his home directory, containing exactly 983 messages.
Case Study #4: Can't access Web server
Several users complained that they were receiving Document contains no data errors when trying to access a particular Web site on the Internet.
The site in question was not viewable in either Netscape or MS Internet Explorer. However, Lynx (a text-based Web browser) was able to bring up the Web pages on the remote server just fine. Given that Lynx worked, and Netscape and MSIE did not, my first question was: What was Lynx doing differently?
Suspecting that it might have something to do with the headers, I fired up a sniffer on our proxy server, and watched the conversation between the client, proxy, and destination server to capture the HTTP dialogue. Unfortunately, I do not have the packet sniffer output available, but it showed that the remote server was dropping the connection immediately after receiving the headers from Netscape and MSIE, whereas the Lynx session was allowed to complete.
By examining the headers, I saw that MSIE and Netscape were sending a Proxy-Connection: Keep-Alive line, but Lynx was not. Using telnet to connect from the proxy server to the remote Web server on port 80, I spoke raw HTTP to the server and discovered that the Proxy-Connection: Keep-Alive header was the culprit -- the remote server immediately dropped the connection without sending back a document. There was not much I could do, other than disable Keep-Alive support on our proxy server, until the remote site resolved this issue.
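The manual test amounts to sending the same request twice, once with the suspect header and once without. A hedged Python sketch of that comparison follows. The host name www.example.com is a placeholder, HTTP/1.0 is an assumption (the protocol version used in the original test is not recorded), and the function names are hypothetical.

```python
import socket


def build_request(host, path="/", extra_headers=None):
    """Return the raw bytes of a minimal GET request."""
    lines = [f"GET {path} HTTP/1.0", f"Host: {host}"]
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    # A blank line terminates the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch(host, request, port=80, timeout=10):
    """Send raw request bytes and return everything the server sends back."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(request)
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


plain = build_request("www.example.com")
with_header = build_request(
    "www.example.com", extra_headers={"Proxy-Connection": "Keep-Alive"})
```

Passing each request to fetch and comparing the replies reproduces the experiment: an empty reply for the second request only would point to the header.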
Putting it All Together
I have barely scratched the surface of packet sniffers and their applications. The point, however, is that packet sniffers can help guide you to the answers to many common systems administration problems. Remember that the packet sniffer is not a solution in and of itself -- you must have some understanding of the underlying protocols that you are monitoring, and you must be able to understand and interpret what you see. The basics outlined in this article should provide you with a solid foundation for exploring more complex issues and problems on your own.
About the Author
John Mechalas has a B.S. and M.S. in Aeronautical and Astronautical Engineering from Purdue University. He has worked at Intel Corporation for five years, where he currently manages a UNIX systems administration and security team for a large microprocessor design site. He can be reached at: firstname.lastname@example.org.