Monitoring and Optimizing NFS Performance
Robert Berry
Introduction
The Network File System (NFS) is an essential part of
your LAN communications,
and its transparency to network users depends directly
on its response time. Achieving optimum performance
can be difficult,
and therefore, performance tracking and optimization
of NFS can be
a challenging task for systems administrators. The UNIX
environment
provides tools that enable you to gather the statistics
needed to
pinpoint areas in which to improve network response
time. As a first
step, you must familiarize yourself with the network
resources and
the users of those resources. Then compile the statistics
on the current
configuration, and determine if the statistics provide
you with good
news or bad news. Finally, if you determine that the
news is bad,
you must identify what the bottlenecks are, based on
the statistical
results.
Familiarize Yourself with Your Network
You may believe you already know your network inside
and out. But
take a moment and think. Are you familiar with the type
of work conducted
by each client on the network? Do you know the types
of RPC requests
that are typically generated by this work? Do you know
which servers
provide most of the resources for each client? Do you
know the network's
busiest time? Its lag times? Essentially, you need to
know who does
what, where, and when on the network.
Information of this sort helps you to model your networking
environment.
If you understand the nature of the network's workload,
you should
be able to develop an accurate representation of your
system when you create a set of benchmarks.
Gather Stats on Your Current Network Configuration
The UNIX environment supplies you with numerous tools
for gathering
network statistics. Some of the more useful are netstat,
nfsstat,
vmstat, iostat, uptime, and spray.
The following sections show an example of each (except
for vmstat
and iostat) and explain their usefulness for collecting
information.
vmstat and iostat were explained thoroughly in a previous
issue (see Bill Genosa's "Monitoring Performance
with iostat and
vmstat," March/April 1994, p.6).
netstat
The command-line syntax is:
netstat [-n] -i
Figure 1 demonstrates a sample output.
The netstat utility gives you information on the reliability
of your local network interface. The first column of
the output is
the device name of your network interface. The second
column, Mtu,
represents maximum transmission unit. The third column,
Net/Dest,
is the actual network to which your interface is connected;
this will
be the actual numbered address if the "-n"
option is used.
The Address column (column four) displays the local
host's
name or, again, the actual IP address if the "-n"
is used.
The remaining columns display the number of input and
output packets,
as well as the number of errors that occurred with each.
The Collis
column displays the number of collisions detected while
the host was transmitting.
The input and output error columns are of most concern
here. A high
number of input errors could result from electrical
problems, from
corrupt packets being received from another host with
a damaged network
interface, from damaged cables, or from a device driver
that has an
improper buffer size. A high number in the output error
column may
indicate a problem with your own network interface.
This analysis assumes that your network has been up and
running for
some time. However, a high number of errors could show
up in either
category if your system has just recovered from a network-wide
power
outage -- particularly if you have many diskless clients.
The key
word here is high: both input and output errors should
be as close
to zero as possible. Still, there will usually be some
errors present,
especially if you have recently disconnected and reconnected
cables
or if your network has periods of intense traffic.
The number in the collision column will likely not be
zero, but should
be a low number relative to the number in the output
packet column.
You can calculate the percentage of collisions observed
by a particular
host by dividing the number in the collision column
by the number
in the output packet column and multiplying the quotient
by one hundred.
Hal Stern, in Managing NFS and NIS (O'Reilly &
Associates,
Inc.), suggests that a collision rate of over 5 percent
indicates
a congested network in need of reorganizing.
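For example, the following shell pipeline is a minimal
sketch of this calculation. It assumes the SunOS-style
netstat -i layout shown in Figure 1, with output packets
in the seventh column and collisions in the ninth; adjust
the field numbers if your system prints differently:

netstat -i | awk 'NR > 1 && $7 > 0 {
    printf "%s: %.2f%% collisions\n", $1, ($9 / $7) * 100
}'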
A collision rate can also be obtained for the entire
network. To calculate
this you would add all hosts' output packet columns
and all hosts'
collision columns, divide the latter by the former,
and multiply by
one hundred, as above. This method is more appropriate
than taking
the sum of all the collision rates for each individual
host and dividing
by the number of hosts, because by this method the busier
hosts will
weigh more heavily on the average than the less busy
hosts. Again
according to Hal Stern, if the rate is greater than
10 percent, your
network is ripe for partitioning.
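If you have collected each host's output packet and
collision totals into a file of "opkts collis" pairs,
one line per host (the file name and format here are
only illustrative), a one-liner produces the network-wide
rate:

awk '{ out += $1; col += $2 }
    END { printf "%.2f%%\n", (col / out) * 100 }' totals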
One caveat is in order here. If you notice a host with
significantly
more collisions than a similar host with similar network
usage, this
may be an indication of electrical problems rather than
network congestion.
nfsstat
nfsstat displays statistical information concerning
the status
of your NFS and remote procedure calls (RPCs) for both
the server
and client sides of your system. Each field in the
output is a window
into the heart of your network operations.
The command-line syntax has three useful forms:
nfsstat -s
nfsstat -c
nfsstat
The first form displays server-side statistics only;
the second displays client-side statistics only; and
the third displays both.
Sample output
for the first two commands is shown in Figure 2 and
Figure 3. The server
display indicates how successfully your server is receiving
packets
from each client. The fields in the display are as follows:
calls -- Indicates the number of RPC calls
received.
badcalls -- Indicates the number of calls
rejected by the RPC layer. Such a rejection would be
generated by
an authentication failure. It also includes the combined
totals of
the badlen and xdrcall fields.
nullrecv -- Indicates the number of times
an nfsd daemon was scheduled to run but did not receive
a packet
from the NFS service socket queue.
badlen -- Indicates the number of RPC calls received
that were shorter than the minimum allowed length.
xdrcall -- Indicates the number of RPC calls received
whose XDR headers could not be decoded.
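As a quick sanity check, you can compute the fraction
of rejected calls directly from this output. The following
is only a sketch: it assumes the layout shown in Figure 2,
where a header line beginning with "calls" is followed
by a line of numbers, so adjust the pattern and field
numbers if your nfsstat prints differently:

nfsstat -s | awk '/^calls/ { getline
    printf "badcalls: %.2f%% of %d calls\n", ($2 / $1) * 100, $1
    exit }'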
The client display indicates how successful your client
is in communicating
with all the NFS servers. The fields in this display
are as follows:
calls -- Indicates the total number of calls
made to the NFS servers.
badcalls -- Indicates the number of RPC
calls that returned an error, either because they timed
out or because the RPC call itself was interrupted.
retrans -- Indicates the number of times
a call had to be retransmitted because there was no
response from
the server.
badxid -- Indicates the number of times
a reply was received from a server that did not correspond
to any outstanding call. Each request is assigned a
transaction ID (XID) when it is generated. At any one
time, several calls may be requesting services on any
number of servers. Occasionally a response arrives whose
XID has already been serviced, and badxid is incremented.
I will discuss the significance of this field later.
timeout -- Indicates the actual number of
calls that timed out waiting for a server's response.
wait -- Indicates the number of times a
call had to wait because a client handle was either
busy or unavailable.
The remaining fields of the client RPC section are not
relevant to
the current topic, and are omitted from the discussion
here.
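A similar sketch gives the client's retransmission and
badxid rates, again assuming the Figure 3 layout (a
"calls" header line followed by a line of numbers, with
retrans in the third field and badxid in the fourth):

nfsstat -c | awk '/^calls/ { getline
    printf "retrans: %.2f%%  badxid: %.2f%%\n",
        ($3 / $1) * 100, ($4 / $1) * 100
    exit }'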
uptime
uptime is a simple tool that allows you to get the current
time, the amount of time the system has been up, the
number of users
on the system, and the three load averages (see Figure
4). The three
load averages are a rough measure of CPU usage over
1-, 5-, and 15-minute
intervals.
What's considered high for these three categories depends
on the number
of CPUs on your system and whether or not your tasks
are CPU-intensive.
Æleen Frisch, in Essential System Administration (O'Reilly
& Associates, Inc.), notes that any value under
3 would not be critical.
spray
The command-line syntax is:
spray hostname [-c count] [-l length] [-d delay]
spray reports the number of packets sent to a
particular host; the time needed to send those packets;
the number
of packets received by the host; and the number and
percent of packets
that were dropped by the host (see Figure 5).
spray is a useful but somewhat limited tool. The output
gives
the number of packets that didn't make it,
but it doesn't
indicate at what point in the network the packets were
lost. Another
limitation is that real-world packets vary in size and
usually arrive in random bursts. But by default, spray sends
1162 packets of
86 bytes in length. With the "-c" and "-l"
options,
you can minimize this limitation by varying the number
and size of
packets. With the "-d" option, you can even
simulate some
delay between packets.
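For example, to better approximate bursty, variable-sized
traffic than the defaults do, you might run something like
the following (the host name is a placeholder, and the
values are only a starting point for experimentation):

spray servername -c 100 -l 8192           # a few large packets
spray servername -c 5000 -l 512 -d 10     # many small packets, slight delay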
Running spray from each of your machines will give you
a good
estimate of a server's performance capabilities and
of the speed of
a particular machine's network interface. You may find
that a server
that receives a large portion of the network traffic
has a slow network
interface; you might then decide to move the file systems
to a faster
machine or provide it with a faster network interface.
Obtaining NFS Benchmarks
You can use the UNIX tools described above to measure
your network's
performance under normal conditions. This will give
you a set of benchmarks
by which to judge your system. This will be handy the
next time a
user comes up and complains about the network being
sluggish. Simply
run the test again and compare it with past results.
The key here is knowing what "normal" is on
your network.
This is the point where being completely familiar with
the network
workload is important. The benchmarks will serve no
purpose if they
do not accurately represent the types and proportions
of RPC requests commonly generated on the network.
To produce benchmarks for your system, you may purchase
any one of
many NFS benchmark traffic generators or you may build
your own using
UNIX utilities. I chose the latter in this case.
Certain UNIX commands can generate the same RPC requests
that are
normally generated by the work conducted on your network.
Generating
a NFS RPC mixture in this fashion can be far more flexible
than using
a ready-made package. These packages are inflexible,
incapable of
changing to fit changing workloads. With a script, as
the nature of
the workload changes on your network, you can reflect
the changes
in the script.
NFS Traffic Generation Script
The first step in creating your own NFS traffic generation
script
is to know which RPC requests are generated by the work
conducted
on your network. To get a listing of the NFS RPC percentages
generated
by your network, run the nfsstat utility on each of
your servers.
This information will help you build a script that comes
as close
as possible to an accurate representation of your network
usage.
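If your servers allow it, a short loop can gather these
figures in one pass; the host names below are placeholders,
and rsh assumes the appropriate .rhosts permissions are
in place:

for host in server1 server2 server3
do
    echo "==== $host ===="
    rsh $host nfsstat -s
done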
Next, you will need to know what UNIX utilities generate
what RPC
requests. Figure 6 gives a sample of some basic utilities
and the
NFS RPC requests they generate. Use a combination of
these in your
script to generate the NFS traffic for your benchmarks,
paying close
attention to the NFS RPC percentages reported for your
network.
Listing 1 provides an example of an NFS traffic-generating
script.
This example is simple, but keep in mind that an NFS
traffic generation
script can be whatever you want it to be, as long as
it closely represents
your network workload. For instance, in this scenario,
the network
is a UNIX network where large CAD and raster files travel
back and forth across the network. Under heavy network usage,
RPC request
percentages on a particular client will be approximately
50 percent
reads and 40 percent writes, with the remainder divided
among various
other RPC requests, such as getattr and lookup.
The sample script starts with an uptime report, to give
you
an indication of your CPU usage. This is not essential;
I added it
to give an overall picture of the network. What is necessary
is that
you become superuser before running this script. The
reason for this
is that the script will next run the nfsstat utility
and display
the current RPC request percentages for the client
before reinitializing
all the percentages back to zero with the nfsstat "-z"
parameter. The nfsstat utility requires you to be superuser
in order to use the "-z" parameter.
The meat of the script is the series of cp commands.
To generate the
50 percent reads and 40 percent writes, the script copies
a large
file within an NFS directory and then copies it once
from the NFS
directory to a local client directory and then once
more. You may
have to experiment to achieve the desired RPC percentages.
For example,
when this script was being built, it turned out that
the client was
able to cache fairly large files. With the file located
in the cache,
few or no disk reads were being requested. To get around
this, the
file had to be made very much larger -- in the case
here, it was
12Mb in size.
Finally, the script performs some cleanup and generates
another uptime
report along with the final nfsstat client report to
check
the RPC request percentages produced. Also added for
good measure
is the spray utility. The script runs spray on each
server to give you some idea of the server's current
packet handling
capabilities.
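For readers without access to Listing 1, the following
sketch follows the sequence just described; the mount
points, file names, file size, and server names are all
hypothetical, so substitute your own:

#!/bin/sh
# NFS traffic-generation sketch -- must be run as superuser,
# since nfsstat -z requires it.
uptime                                  # baseline CPU load
nfsstat -c                              # current client RPC mixture
nfsstat -z                              # zero the counters
# Use a file large enough to defeat the client cache (12Mb here).
cp /nfs/cad/bigfile /nfs/cad/bigfile.tmp    # copy within the NFS directory
cp /nfs/cad/bigfile.tmp /tmp/big1           # NFS to local, once
cp /nfs/cad/bigfile.tmp /tmp/big2           # and once more
# Clean up, then report the results.
rm -f /nfs/cad/bigfile.tmp /tmp/big1 /tmp/big2
uptime
nfsstat -c                              # check the RPC percentages produced
for host in server1 server2             # current packet-handling capability
do
    spray $host
done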
Remember that your script can be any sequence of UNIX
utilities as
long as they reflect the RPC requests generated by your
network's
workload. I used the cp utility here because it generates
read and write RPC requests (see Figure 6). You will
need to experiment
with combinations of utilities to meet your own requirements.
It is
also a good idea to run your script at various times
over several
days to see if it will produce close to the same results
each time.
Possible NFS Performance Bottlenecks
When examining possible performance bottlenecks on your
network, keep
in mind that there are two sides to the network: the
server side and
the client side. Are the server hardware and software
inadequate for
the client's jobs, or are the client's jobs too numerous
and difficult
for the server hardware and software?
Server
On the server side, a number of key hardware components
can cause
bottlenecks and should be watched closely. I mentioned
earlier the
network interface itself, but some others you should
consider on a
server are the CPU, memory, and the hard disk.
Regarding the CPU, the concern is not so much the speed
of the CPU,
although faster is better, but how fast jobs are scheduled
for CPU
usage. A potential bottleneck is an increased latency
in scheduling
NFS daemons. nfsd daemons have kernel process priority,
and
under normal conditions, they are run by the
CPU immediately
upon an NFS request. But if the server has a number
of I/O interrupts
or other kernel priority calls running, NFS requests
can build while
nfsd daemons are waiting for CPU time. A solution might
be
to limit local access to a server to reduce the number
of I/O and
kernel priority system calls.
iostat and vmstat provide useful information on CPU
job loading.
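Running each with an interval argument, for example,
samples continuously while your benchmark script executes:

iostat 5     # disk and CPU activity every five seconds
vmstat 5     # memory, paging, and CPU activity every five seconds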
The main concern regarding memory as a bottleneck is
to ensure that
the server has enough to handle all its processes. This
will reduce
page swapping, which can interfere with NFS services.
With hard disks, as with CPUs, the bottleneck is caused
not so much
by the speed of the drive (although, once again, the
faster the better),
but the overloading of NFS disk access requests. If
you have a disk
that receives more than its share of NFS requests, you
might want
to consider spreading the heavily used filesystems over
several disks.
Client
In some instances you might discover that the server
isn't the bottleneck
of the network. In fact, it might turn out that there
is no bottleneck
at all -- only a client that wants too much in
too little time.
If this is the case, then some constraints must be placed
on that
client.
A client sends an NFS request to a particular server.
If it doesn't
receive a reply within the allotted time period, the
request will
time out and be retransmitted. The client does not respect
the fact
that you've tuned the server to the best of its hardware
capabilities.
It doesn't care if the request is still queued on the
server and will
be served eventually. All it knows is that it didn't
receive a reply
in the allotted time, so it sends the request again.
The server will
then respond even more slowly as NFS requests build.
You may see an indication of this problem with the nfsstat
utility. If you run nfsstat with the "-rc"
flag and
you notice a large number in the badxid field and an
even
larger number in the timeout field, then it is likely
that
your client is demanding too much from your server.
A simple correction
for this problem is to increase the timeout parameter
in the mount
utility.
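For instance, on many systems the timeout is set with
the timeo option (given in tenths of a second) at mount
time. The server name and paths below are placeholders,
and you should check your mount(8) manual page for the
exact option names on your system:

umount /home/server
mount -o timeo=20,retrans=5 server:/export/home /home/server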
Conclusion
Monitoring and optimizing NFS performance is a challenging
process.
UNIX provides you with useful tools to perform this
task. Each of
the tools covered here provides extensive capabilities,
of which only
a small sample was touched upon in this article. I
suggest that you
experiment with these tools and develop your own cause-and-effect
analysis of NFS performance.
Bibliography
Frisch, Æleen. Essential System Administration. Sebastopol,
CA: O'Reilly & Associates, Inc., 1991.
Peek, Jerry, Tim O'Reilly, and Mike Loukides. UNIX
Power Tools.
Sebastopol, CA: O'Reilly & Associates/Bantam Books,
1993.
Stern, Hal. Managing NFS and NIS. Sebastopol, CA: O'Reilly
& Associates, Inc., 1992.
About the Author
Robert Berry has been working with SunOS and DG/UX
since 1991.
He received his BS degree from the University of Maryland
and is working
on an MS degree from the University of West Florida.
He is currently
the Systems Administrator and Networking Manager at
Spectrum Sciences
& Software, Inc. His interests are in PC-to-UNIX
networking and network
programming. Robert Berry can be contacted at 242 Vickie
Leigh Rd.,
Fort Walton Beach, FL 32547. Fax (904) 862-8111.