Making the Most of NFS
David S. Linthicum
Network performance problems are frustrating for users and
managers alike -- users, because of sludgy response time, and
managers, because the source of the problems may be difficult to find.
This article presents basic steps for making the most
of your NFS
system. It first explains how to monitor and understand network performance,
then addresses tuning and configuration issues. The information
here is generic; you'll need to consult the manual for your own system.
Monitoring the Network Request Load
To develop a clear understanding of the demands on your network,
monitor the network's request load. NFS requests are generated
when the network
performs activities such as loading a file into memory
from the server,
or making multiple I/O requests on the server drive,
which may be
part of normal database operations. NFS requests are irregular
in that they arrive in random bursts, and the requests
in a burst are usually dissimilar.
A network's request load varies widely during the day. For example,
you may find the network load is high in the morning
as users read
their news, and low at lunchtime while most users are
out of the office.
To get a complete picture, monitor the network over
a typical day's
Measuring the Network
The first step in analyzing performance is to benchmark the network;
that is, to define the network's current level of performance. The
benchmarking technique you select should closely reflect
the NFS call
rates as well as RPC (Remote Procedure Call) distribution. A
procedure that provides a measurement under a steady load
may not accurately reflect the real-life situation of your network.
Your benchmark should represent your daily network requirements as
closely as possible. Always consider your actual client
When using a benchmark, be careful of cache effects. Caching
on the clients may distort the benchmark results, since reads
and writes to the same files will generally be handled
by the cache,
and will sharply reduce the number of RPC requests made
to the network.
If possible, try to isolate clients when loading the network. This
allows you to narrow down certain areas of the network
in the benchmark.
Network performance is measured by monitoring the average response
time for the typical user on the network. Divide the
number of NFS
RPCs made by the time they took to complete. An RPC
communications by defining a standard message format
used by high-level
protocols. An RPC also provides the mechanism used to
define a remote
node's operations. You can use the nfsstat program or
a third-party network benchmark program, or you can
create your own.
nfsstat tracks the number of RPC calls made over a period of
time. This program comes with most NFS networks (at
least on the UNIX
side). The nfsstat command displays statistical information
pertaining to the network and the RPC interfaces. It is important
to note that the RPC interface relies directly on the network. nfsstat
accepts the following options:
-c Displays the client information.
-s Displays the server information.
-n Displays the NFS information.
-r Displays the RPC information.
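
As a rough illustration of the division described earlier (RPCs made, divided by the time they took), the following Python sketch computes a call rate from two readings of a cumulative RPC call counter. The RpcSample type and the figures in the usage note are hypothetical; how you obtain the counter values (for example, by reading nfsstat output by hand) depends on your system.

```python
from dataclasses import dataclass


@dataclass
class RpcSample:
    """One reading of a cumulative RPC call counter (hypothetical type)."""
    timestamp: float  # seconds since some fixed starting point
    calls: int        # total RPC calls counted so far


def call_rate(first: RpcSample, second: RpcSample) -> float:
    """Return RPC calls per second between two readings."""
    elapsed = second.timestamp - first.timestamp
    if elapsed <= 0:
        raise ValueError("second sample must be taken after the first")
    return (second.calls - first.calls) / elapsed
```

For example, a counter that moves from 1000 to 4000 calls over 60 seconds gives call_rate(RpcSample(0, 1000), RpcSample(60, 4000)), or 50 calls per second.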
When nfsstat is used to monitor average response
time, you should notice (at least at first) that as
the network load
is increased the average response time naturally goes
up. In fact,
the average response in the beginning closely follows
the number of
NFS requests that are made by users. After a while, the
response time line should smooth out. Generally, take a sampling over several
hours of typically heavy operations to get a good reading. The network
should be able to handle the peaks and valleys without a significant
increase in response time, although when the network
load is suddenly
increased, the response time may increase until the
network has a
chance to recover. The server may require additional
time to recover
from large bursts, and this may leave the response time high.
At this time, users may encounter nasty messages such
as RPC timeouts.
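
When reading a long series of response-time measurements, it can help to look past individual bursts. The Python sketch below applies a simple moving average to a list of samples. This is only one possible smoothing technique, chosen here for illustration; the window size is an assumption and should be adjusted to your sampling interval.

```python
from collections import deque


def moving_average(samples, window=3):
    """Smooth a sequence of measurements with a trailing moving average.

    Each output value is the mean of the current sample and up to
    window - 1 samples before it.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)  # holds only the most recent samples
    smoothed = []
    for value in samples:
        recent.append(value)
        smoothed.append(sum(recent) / len(recent))
    return smoothed
```

A burst that shows up as a single high sample is flattened in the smoothed series, while a sustained rise in response time still stands out.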
The netstat program provides information on such things as
UDP socket overflows and IP packets (see Figure 1). While nfsstat
assists in direct NFS diagnostics, netstat is helpful when
tuning the server or spotting low-level problems.
Places to Check
The performance measurements provide the administrator with information
about how the network is currently performing, but it is also necessary
to identify particular problems caused by certain network components,
or "bottlenecks." There is no easy way to
spot network bottlenecks,
but the following list of potential network problems
should at least
tell you where to look.
Network Interface (cards) -- The network interface may not
be sending or receiving packets due to a failure. Fortunately, when
network interface cards go, they usually go quickly.
Some do linger
on and should be considered as suspect if network performance degrades.
Bandwidth -- The network may be congested. This slows
transmissions from the clients to the server. You may
spot this problem
by an overabundance of RPC timeout errors. NFS networks that rely
on corporate or company-wide networks for transport
often show bandwidth
problems. In many cases, the only solution to this problem
is to take
your NFS network off the main network, and place NFS
on a network
of its own.
Repeaters, Bridges, Routers, and Gateways -- These network
components allow your network to connect up with other networks
or computers, but can potentially cause trouble. Problems with these
network components can be easily spotted using common diagnostic tools.
Server -- A few problems may be traced back to the server.
First, the server may be so bombarded with packets that
it can't handle
all of them. Second, the CPU on the server may be overloaded. The
server is responsible for scheduling an nfsd daemon
to perform the
network operation. If the server is under-powered, and
cannot handle
the number of daemons required, then network performance suffers.
In addition to the CPU cycles required, the amount of available
memory may be a problem. If the amount of memory is
low, and the server
is required to handle a large NFS network load, then the
system may begin to page, drastically reducing the server's performance.
Another possible server bottleneck is the disk bandwidth,
or the ability
to get information off of the disk into the client memory. This
is compounded by the fact that NFS write operations are synchronous,
circumventing caching operations and disk controller optimization.
An NFS network is multidimensional. There are several
areas to look
at to increase performance or to find a performance
problem. A poorly
performing network may be caused by a loose network connection, a
server that desperately needs a memory upgrade, a router/bridge/gateway
problem, and so on. Be absolutely sure that a hardware component is
the problem before actually replacing it. If you have spare
hardware around, try swapping out the suspect component and check
network performance again. If there is little or no improvement, look elsewhere.
If all of the network components check out okay, then
it is time to
look at the server configuration as the culprit. If
monitoring has determined that your NFS performance is poor,
and the hardware components all seem to be operating properly, this
usually means the NFS server is substandard: not able
to handle new
or normally scheduled requests.
First, look at the CPU. NFS does not usually constrain
the CPU. The
nfsd daemon reads and decodes an RPC request, and not much more
is required from the CPU by NFS. The problem occurs
when the CPU handles
other operations besides NFS network processing. UNIX
is a funny beast.
It will do all that is asked of it, with no complaints,
but it does
place a strain on components. Generally, NFS servers
also act as mail
servers, print servers, and terminal servers. This may burden the
CPU with high priority or interrupt level processes.
The NFS daemons
run with kernel process priority, therefore they run
with a higher
priority than user processes, such as applications. When other
kernel level processes are present, the latency to schedule the nfsd
daemons is greater because the daemons do not get the
CPU as soon
as required. This latency in scheduling means decreased NFS
performance -- the nfsd daemons are not as responsive to
the requests as they should be.
It is also important to note that, since the NFS daemons are kernel
processes, they can make user process performance (such
as database processing)
unacceptable. To free up your CPU for NFS processing,
make sure all
extraneous activities (such as terminal processing and
bridges) are moved to other CPUs on the network, if possible. Administrators
find NFS servers convenient nodes on the network to
place these devices,
but too many kernel level processes do not an effective NFS server
make. If your NFS server is performing too many other tasks, you
may find NFS performance unacceptable. Use sar or vmstat to monitor
CPU utilization. sar also monitors buffer and inode activity, but
it does not replace netstat or nfsstat (see Figure 2).
The most significant aspect of NFS server tuning is choosing the
number of server daemons to run. This shows a classic trade-off: too
many and CPU performance suffers; too few and network performance
suffers. The goal is to find a happy medium. As a rule,
have two nfsd
daemons for each concurrent disk operation. For example, if you
run two disks on the same controller and can schedule
two disk operations
at the same time, four nfsd daemons are recommended. At this
point, check UDP socket overflows using netstat (see
If you see any UDP socket overflow, add daemons,
and check again.
Stop adding daemons when socket overflow is nominal.
Next, check network performance. If NFS performance is still poor,
begin adding daemons again. Make sure you stop when
the server's load
average increases without a corresponding network performance increase.
Make sure you test your network under normal user load. Benchmark
programs are fine if you have no other choice, but a real user
load is best. This, again, is a trial and error procedure. Once you
find the optimal number of NFS daemons, later you may
have to alter
this number as users and the user application load increases.
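
The daemon-tuning procedure above can be sketched as a small decision helper in Python. The function names, the argument names, and the idea of a numeric "nominal" overflow level are assumptions made for illustration; the article does not define what counts as nominal, and the overflow counts would come from your own reading of netstat output.

```python
def starting_nfsd_count(concurrent_disk_ops):
    """Rule of thumb from the text: two nfsd daemons per concurrent disk operation."""
    if concurrent_disk_ops < 1:
        raise ValueError("need at least one concurrent disk operation")
    return 2 * concurrent_disk_ops


def next_step(udp_overflows, nfs_performance_ok, load_rose_without_gain,
              nominal_overflows=0):
    """Suggest whether to add nfsd daemons or stop, in the order the text gives."""
    # First pass: UDP socket overflows above the nominal level call for more daemons.
    if udp_overflows > nominal_overflows:
        return "add daemons"
    # Second pass: stop once extra daemons only raise the server's load average.
    if load_rose_without_gain:
        return "stop"
    # Otherwise keep adding only while NFS performance is still unsatisfactory.
    return "stop" if nfs_performance_ok else "add daemons"
```

With two disk operations possible at once, starting_nfsd_count(2) gives the four daemons recommended in the text. After each change, measure again under a normal user load before calling next_step with the new observations.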
NFS takes advantage of the server's buffer cache subsystem.
As a rule
of thumb, you may want to increase the number of pages allocated to cache
memory from the default. Ten to twenty percent of total
memory is most likely your range, but it depends on
the server load.
Note that when you allocate memory for cache, you reduce the available
memory for user processes. If your server also provides other
services, your users won't be happy.
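
The ten to twenty percent rule of thumb is simple arithmetic; the Python sketch below works it out for a given amount of total memory. The function name is hypothetical, and the result is only a starting range, since the right figure depends on the server load and on how much memory user processes need.

```python
def buffer_cache_range(total_memory_bytes):
    """Return (low, high) cache sizes in bytes: 10% and 20% of total memory."""
    if total_memory_bytes <= 0:
        raise ValueError("total memory must be positive")
    low = total_memory_bytes // 10   # ten percent, rounded down
    high = total_memory_bytes // 5   # twenty percent, rounded down
    return low, high
```

For a server with 64 MB of memory, buffer_cache_range(64 * 1024 * 1024) suggests a cache of roughly 6.4 MB to 12.8 MB.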
Sometimes the problem with NFS performance is the server's inability
to process NFS disk requests. In this case, consider
disk tuning factors, such as adequate kernel table sizes, the
balance of the disk requests for all disks on the network, as well
as the performance of the disk as a device. As a rule,
use the fastest
disk available on the server, and make sure that the
disk is optimally
tuned. NFS read and write operations usually can't take advantage
of caching or disk controller optimization, such as
one would find
in SCSI and bus mastering technology.
File system fragmentation can further cause a performance problem.
This is caused by large files, such as database files,
that are not
contiguously allocated on disk. This means the read/write head has
to move back and forth over a larger area to gather the data
in the file. UNIX systems normally do not come with defragmentation
software. Therefore, the only remedy to this problem
is to dump the
entire file system to tape and reload.
A big directory may also cause a problem with NFS performance, since
directories are searched linearly during lookup operations. The time
required to find a named directory entry is proportional to
the size of
the directory and where the name is contained. Therefore, the more
names contained in a directory, the longer it takes
to search the
directory. Also, beware of symbolic links that point
to other symbolic
links. UNIX resolves each one as another lookup, which takes more
time to resolve than a single symlink to an inode.
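
To find symbolic links that point to other symbolic links, you can count how many links must be followed before reaching something that is not a link. The Python sketch below does this for the final component of a path using the standard os module. The function name and the hop limit are assumptions for illustration, and the sketch does not examine symbolic links in the intermediate directories of the path.

```python
import os


def count_symlink_hops(path, limit=32):
    """Count the symbolic links followed from path until a non-link is reached."""
    hops = 0
    current = path
    while os.path.islink(current):
        if hops >= limit:
            raise OSError("too many levels of symbolic links: " + path)
        target = os.readlink(current)
        # A relative target is interpreted relative to the link's own directory;
        # os.path.join returns an absolute target unchanged.
        current = os.path.join(os.path.dirname(current), target)
        hops += 1
    return hops
```

A result greater than 1 means the path goes through a chain of links, each of which costs the server another lookup.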
Some disk problems can be solved by reallocating disk resources.
The administrator must balance the load, usually by
giving a heavily
used file system its own (hopefully faster) disk. With more than one
disk, requests can be serviced in parallel. Using utilities such as
iostat, administrators can spot disks that are overloaded.
Moving file systems around can be a bit tricky. Make
sure you think
about space requirements and disk speed before actually making the
move. If all else fails, sometimes it just comes down to spending
the money on faster and larger drives.
Configuring the kernel can be a challenge. Only attempt
to alter the
kernel configuration after exhausting the other methods
of NFS performance
tuning. Some NFS requests need information about the
inode of a file,
instead of the blocks of data that actually compose the file.
A problem can exist with the inode table that serves
as a cache for
recently opened files. To control the inode table, set
parameter for Berkeley, or the INODE/NINODE parameter
for System V
(see Figure 3). On your NFS server, the inode table's size
is probably not set to its optimal level. As a rule,
add 1.0 for each
diskless client connected to the server in question,
and 0.5 for each
NFS client. Also, consider increasing the size of the directory name
lookup cache. A larger cache should eliminate the need
to read the
disk when reading directories, but again, this is a trade-off
with user applications.
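
The per-client rule of thumb for the inode table (1.0 for each diskless client, 0.5 for each NFS client) can be worked out as below. This Python sketch only computes the total increment; the text as it stands does not make clear which kernel parameter the increment is added to, so check your system's documentation before applying the result. The function name is hypothetical.

```python
def client_increment(diskless_clients, nfs_clients):
    """Total increment from the rule: 1.0 per diskless client, 0.5 per NFS client."""
    if diskless_clients < 0 or nfs_clients < 0:
        raise ValueError("client counts cannot be negative")
    return 1.0 * diskless_clients + 0.5 * nfs_clients
```

A server with 4 diskless clients and 10 NFS clients would call for an increment of client_increment(4, 10), which is 9.0.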
This article has given you some ideas about what causes
and how to
solve NFS network performance problems. At the very
least, you should
now know where to look. An NFS network is complex. Solving one problem
does not mean there are not others. Problems can exist
at the hardware,
operating system, and NFS levels. Remember your users when tuning
your network. They pay the ultimate price for a network
that is not
functioning optimally. Monitoring NFS network performance
is an ongoing
task, even after the network is analyzed, checked, re-checked, and tuned.
Your NFS and UNIX documentation contains a detailed description of
how to configure your particular system. Consult your documentation
when considering changes to NFS or operating system configuration
files. If you are responsible for large networks, make
sure a good
deal of planning goes along with monitoring and configuring your
network. Many mistakes can be made during installation
of the network.
By allocating only a few more hours of planning, you
can avoid many
problems and corrections.
Linthicum, David S. "UNIX Facilities for Database
Tuning." Database Programming and Design. January
Linthicum, David S. "Bridging the UNIX Gap."
Database Programming and Design. December 1991.
Loukides, Mike. System Performance Tuning.
Sebastopol, CA: O'Reilly & Associates, 1990.
Stern, Hal. Managing NFS and NIS. Sebastopol,
CA: O'Reilly & Associates, 1991.
About the Author
David Linthicum is currently working with Mobil Oil
Virginia, as a Senior Software Engineer. The author
of over a dozen
articles appearing in several technical publications,
he just completed
a book for Microtrend entitled: Motif Programmer's Library,
in the beginning of 1993. He is also the co-author of
a book entitled:
Introduction to Programming for Que, which was released
of 1992. Dave teaches System Analysis and Design and
at Northern Virginia Community College in Sterling,
has been teaching there since 1987. He can be reached
at (703) 818-9164
or at firstname.lastname@example.org.