Making the Most of NFS
David S. Linthicum
Introduction
Network performance problems are frustrating for users and network
managers alike -- users, because of sluggish response times, and
managers, because the source of the problems may be very difficult
to identify.
This article presents basic steps for making the most
of your NFS
system. It first explains how to monitor and understand
system performance,
then addresses tuning and configuration issues. The
material presented
here is generic; you'll need to consult the manual for
your particular
system.
Monitoring the Network Request Load
To develop a clear understanding of the demands on your network,
monitor the network's request load. NFS requests are generated when
clients perform activities such as loading a file into memory from
the server, or making multiple I/O requests on the server drive,
which may be part of normal database operations. NFS traffic is
inherently bursty: requests arrive in irregular bursts, and the
requests within a burst are usually dissimilar.
A network's request load varies widely during the day.
For example,
you may find the network load is high in the morning
as users read
their news, and low at lunchtime while most users are
out of the office.
To get a complete picture, monitor the network over
a typical day's
operation.
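One way to turn a day's worth of readings into a load profile is to
average the call rate for each hour and pick out the peak. The sketch
below assumes you have already collected (hour, calls-per-second)
pairs by some means; the function names, the data layout, and the
sample readings are all hypothetical.

```python
from collections import defaultdict


def hourly_call_rates(samples):
    """Average the observed NFS call rate for each hour of the day.

    `samples` is an iterable of (hour, calls_per_second) pairs with
    hour in 0-23. The layout is an assumption made for this sketch.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for hour, rate in samples:
        totals[hour] += rate
        counts[hour] += 1
    return {hour: totals[hour] / counts[hour] for hour in sorted(totals)}


def busiest_hour(samples):
    """Return the hour with the highest average call rate."""
    rates = hourly_call_rates(samples)
    return max(rates, key=rates.get)


# Made-up readings: a busy morning and a quiet lunch hour.
readings = [(9, 120.0), (9, 140.0), (12, 20.0), (12, 30.0), (15, 80.0)]
print(hourly_call_rates(readings))
print("busiest hour:", busiest_hour(readings))
```

A profile like this makes it easier to choose the hours in which to
run the heavier measurements described later.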
Measuring the Network
The first step in analyzing performance is to benchmark
the network,
that is, to define the network's current level of performance.
The
benchmarking technique you select should closely reflect
the NFS call
rates as well as RPC (Remote Procedure Call) distribution.
A benchmarking
procedure that provides a measurement under a steady
network load
may not accurately reflect the real-life situation of
your network.
Your benchmark should represent your daily network requirements
as
closely as possible. Always consider your actual client
workloads.
When using a benchmark, beware of cache effects.
The cache
on the clients may distort the benchmark results, since
repeated reads
and writes to the same files will generally be handled
by the cache,
and will sharply reduce the number of RPC requests made
to the network.
If possible, try to isolate clients when loading the
network. This
allows you to narrow down certain areas of the network
in the benchmark.
Network performance is measured by monitoring the average response
time for the typical user on the network. To compute it, divide the
time a set of NFS RPCs took to complete by the number of RPCs made.
(Dividing the other way, calls by time, gives the call rate.) An RPC
allows remote communications by defining a standard message format
used by high-level protocols. An RPC also provides the mechanism used
to define a remote node's operations. You can take these measurements
with the nfsstat program or a third-party network benchmark program,
or you can create your own program.
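If you do write your own program, the arithmetic is small. The sketch
below assumes you can read a cumulative call counter at two points in
time and can time a batch of calls yourself; the function names are
illustrative and not part of any NFS tool. It treats the call rate as
calls divided by time, and the average response time as time divided
by calls.

```python
def call_rate(calls_start, calls_end, elapsed_seconds):
    """NFS calls per second between two readings of a cumulative counter."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (calls_end - calls_start) / elapsed_seconds


def average_response_time(total_seconds, calls_completed):
    """Average seconds per RPC: total completion time over number of RPCs."""
    if calls_completed <= 0:
        raise ValueError("calls_completed must be positive")
    return total_seconds / calls_completed


# Made-up figures: 6,000 calls in one minute, and 150 calls in 3 seconds.
print(call_rate(10_000, 16_000, 60))
print(average_response_time(3.0, 150))
```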
nfsstat tracks the number of RPC calls made over a period
of
time. This program comes with most NFS networks (at
least on the UNIX
side). The nfsstat command displays statistical information
pertaining to the network and the RPC interfaces. It
is important
to note that the RPC interface relies directly on the
kernel. nfsstat
uses the following syntax:
    nfsstat [-csnr]
    -c   Displays the client information.
    -s   Displays the server information.
    -n   Displays the NFS information.
    -r   Displays the RPC information.
When nfsstat is used to monitor average response
time, you should notice (at least at first) that as
the network load
is increased the average response time naturally goes
up. In fact,
the average response in the beginning closely follows
the number of
NFS requests that are made by users. After a while, the
average response
time line should smooth out. Generally, take a sampling
over several
hours of typically heavy operations to get a good reading.
The network
should be able to handle the peaks and valleys without
a prolonged
increase in response time, although when the network
load is suddenly
increased, the response time may increase until the
network has a
chance to recover. The server may require additional
time to recover
from large bursts, and this may leave the response time
line high.
At this time, users may encounter nasty messages such
as RPC timeouts.
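To tell a short burst from a prolonged increase, it can help to
smooth the response-time readings and check how long they stay high.
The following is a minimal sketch that assumes you already have a
list of response-time samples taken at regular intervals; the window
size, threshold, and sample values are hypothetical and should come
from your own baseline.

```python
def moving_average(values, window):
    """Trailing moving average; early points average what is available."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the sample leaving the window
        smoothed.append(running / min(i + 1, window))
    return smoothed


def prolonged_increase(values, threshold, min_samples):
    """True if at least `min_samples` consecutive readings exceed `threshold`."""
    run = 0
    for value in values:
        run = run + 1 if value > threshold else 0
        if run >= min_samples:
            return True
    return False


# Made-up response times in milliseconds.
burst = [20, 25, 90, 30, 22]       # one spike, then recovery
sustained = [20, 80, 85, 90, 95]   # the server never recovers
print(prolonged_increase(burst, threshold=50, min_samples=3))
print(prolonged_increase(sustained, threshold=50, min_samples=3))
```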
The netstat program provides information on such things
as
UDP socket overflows and IP packets (see Figure 1).
Although nfsstat
assists in direct NFS diagnostics, netstat is helpful
when
tuning the server or spotting low level problems.
Places to Check
The performance measurements provide the administrator
with information
about how the network is currently performing, but it
is difficult
to identify particular problems caused by certain network
components,
or "bottlenecks." There is no easy way to
spot network bottlenecks,
but the following list of potential network problems
should at least
tell you where to look.
Network Interface (cards) -- The network interface may
not
be sending or receiving packets due to a failure. Fortunately,
when
network interface cards go, they usually go quickly.
Some do linger
on and should be considered as suspect if network performance
is suffering.
Bandwidth -- The network may be congested. This slows
down
transmissions from the clients to the server. You may
spot this problem
by an over-abundance of RPC timeout errors. NFS networks
that depend
on corporate or company-wide networks for transport
often show bandwidth
problems. In many cases, the only solution to this problem
is to take
your NFS network off the main network, and place NFS
on a network
of its own.
Repeaters, Bridges, Routers, and Gateways -- These network
components allow your network to connect up with other
remote networks
or computers, but can potentially cause trouble. Problems
with these
network components can be easily spotted using common
network hardware
diagnostic equipment.
Server -- A few problems may be traced back to the server.
First, the server may be so bombarded with packets that
it can't handle
all of them. Second, the CPU on the server may be overloaded.
The
server is responsible for scheduling an nfsd daemon
to perform the
network operation. If the server is under-powered, and
cannot handle
the number of daemons required, then network performance
is adversely
affected.
In addition to the CPU cycles required, the amount of
installed server
memory may be a problem. If the amount of memory is
low, and the server
is required to handle a large NFS network load, then
the operating
system may begin to page, drastically reducing the server
performance.
Another possible server bottleneck is the disk bandwidth,
or the ability
to get information off the disk and into client memory.
This problem
is compounded by the fact that NFS write operations
are synchronous,
circumventing caching operations and disk controller
ordering.
An NFS network is multidimensional. There are several
areas to look
at to increase performance or to find a performance
problem. A poorly
performing network may be caused by a loose network
connection, a
server that desperately needs a memory upgrade, a router/bridge/gateway
problem, and so on. Be absolutely sure that a hardware
component is
the problem before actually replacing it. If you have
extra working
hardware around, try swapping out the suspect component
and checking
network performance again. If there is little or no
improvement, look
elsewhere.
Tuning NFS
If all of the network components check out okay, then
it is time to
look at the server configuration as the culprit. If
network performance
monitoring has determined that your NFS performance
is unacceptable,
and the hardware components all seem to be operating
properly, this
usually means the NFS server is substandard: not able
to handle new
or normally scheduled requests.
First, look at the CPU. NFS does not usually constrain the CPU. The
nfsd daemon reads and decodes an RPC request, and not much else is
required from the CPU by NFS. The problem occurs when the CPU handles
other operations besides NFS network processing. UNIX is a funny
beast. It will do all that is asked of it, with no complaints, but it
does place a strain on components. Often, NFS servers also act as
mail servers, print servers, and terminal servers. This may overload
the CPU with high priority or interrupt level processes. The NFS
daemons run with kernel process priority; therefore, they run with a
higher priority than user processes, such as applications. If other
kernel level processes are present, the latency to schedule the NFS
daemons is greater because the daemons do not get the CPU as soon as
required. This latency in scheduling means decreased NFS request
performance -- the nfsd daemons are not as responsive to the requests
as they should be.
It is also important to note that, since the NFS daemons are kernel
processes, a heavy NFS load can make user process performance (such
as database processing) unacceptable. To free up your CPU for NFS
processing, make sure all extraneous activities (such as terminal
processing, gateways, and bridges) are moved to other CPUs on the
network, if possible. Administrators find NFS servers convenient
nodes on the network to place these devices, but too many kernel
level processes do not an effective NFS server make. If your NFS
server is performing too many other duties, you may find NFS
performance unacceptable. Use sar or vmstat to monitor CPU
utilization. sar also monitors buffer and inode activity, but it does
not replace netstat or nfsstat (see Figure 2).
The most significant aspect of NFS server tuning is
selecting the
number of server daemons to run. This shows a classic
trade-off: too
many and CPU performance suffers; too few and network
performance
suffers. The goal is to find a happy medium. As a rule,
have two nfsd
daemons for each concurrent disk operation. For some
systems that
run two disks on the same controller and can schedule
two disk operations
at the same time, four nfsd daemons are recommended.
At this
point, check UDP socket overflows using netstat (see
Figure 1).
If you see any UDP socket overflow, add daemons,
and check again.
Stop adding daemons when socket overflow is nominal.
Next, check network performance. If NFS performance
still suffers,
begin adding daemons again. Make sure you stop when
the server's load
average increases without a corresponding network performance
increase.
Make sure you test your network under normal user load.
Benchmarking
programs are fine if you have no other choice, but a
real-life user
load is best. This, again, is a trial and error procedure.
Once you find the optimal number of NFS daemons, you may have to
alter this number later as the number of users and the application
load increase. They always do.
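The trial-and-error procedure above can be written down as a small
decision rule. This is only a sketch of the logic described in the
text: the function names, the idea of passing in "gains" measured
since the last change, and the sample numbers are assumptions, and
the measurements themselves still have to come from netstat and your
own performance monitoring.

```python
def starting_nfsd_count(concurrent_disk_ops):
    """Rule of thumb from the text: two nfsd daemons per concurrent disk operation."""
    return 2 * concurrent_disk_ops


def should_add_daemon(udp_overflows, nominal_overflows, performance_ok,
                      perf_gain=0.0, load_gain=0.0):
    """Decide whether to add another nfsd daemon after a measurement round.

    udp_overflows     -- UDP socket overflows seen since the last check
    nominal_overflows -- the level you are willing to treat as nominal
    performance_ok    -- True if NFS performance is now acceptable
    perf_gain         -- performance change since the last daemon was added
    load_gain         -- server load average change since the last addition
    """
    # Phase 1: keep adding daemons while socket overflows exceed nominal.
    if udp_overflows > nominal_overflows:
        return True
    # Phase 2: overflows are nominal; add only if NFS still performs poorly.
    if performance_ok:
        return False
    # Stop once load average rises without a matching performance gain.
    if load_gain > 0 and perf_gain <= 0:
        return False
    return True


# A server that can schedule two disk operations at once starts with four.
print(starting_nfsd_count(2))
print(should_add_daemon(udp_overflows=12, nominal_overflows=0,
                        performance_ok=False))
```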
NFS takes advantage of the server's buffer cache subsystem.
As a rule
of thumb, you may want to increase the number of pages allocated
for cache
memory from the default. Ten to twenty percent of total
available
memory is most likely your range, but it depends on
the server load.
Note that when you allocate memory for cache, you reduce
the available
memory for user processes. If your server also provides
application
services, your users won't be happy.
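As a quick worked example of the ten-to-twenty-percent guideline, the
sketch below converts a memory size into a range of cache pages. The
function name is hypothetical, and the 64 MB memory size and 4 KB page
size are assumptions chosen only for illustration; check your own
system's page size and how its cache parameter is expressed.

```python
def buffer_cache_page_range(total_memory_bytes, page_size_bytes):
    """Return (low, high) page counts for 10% and 20% of total memory."""
    if page_size_bytes <= 0:
        raise ValueError("page_size_bytes must be positive")
    total_pages = total_memory_bytes // page_size_bytes
    return total_pages * 10 // 100, total_pages * 20 // 100


# Assumed example: 64 MB of memory with 4 KB pages.
low, high = buffer_cache_page_range(64 * 1024 * 1024, 4096)
print(low, high)
```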
Sometimes the problem with NFS performance is the server's
ability
to process NFS disk requests. In this case, consider
the standard
disk tuning factors, such as adequate kernel table sizes,
the distribution
balance of the disk requests for all disks on the network,
as well
as the performance of the disk as a device. As a rule,
use the fastest
disk available on the server, and make sure that the
disk is optimally
tuned. NFS read and write operations usually can't take
full advantage
of caching or disk controller optimization, such as
one would find
in SCSI and bus-mastering technology.
File system fragmentation can cause further performance problems.
Fragmentation occurs when large files, such as database files,
are not
contiguously allocated on disk. This means the read/write
head has
to move back and forth over a larger area to gather
information held
in the file. UNIX systems normally do not come with
defragmentation
software. Therefore, the only remedy to this problem
is to dump the
entire file system to tape and reload.
A big directory may also cause a problem with NFS performance
since
directories are searched linearly during lookup operations.
The time required to find a name in a directory depends on the size
of the directory and on where the name sits within it. Therefore,
the more
names contained in a directory, the longer it takes
to search the
directory. Also, beware of symbolic links that point
to other symbolic
links. UNIX resolves each one as another lookup, which
takes more
time to resolve than a single symlink to an inode.
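A short script can help locate both trouble spots on an exported file
system. The sketch below uses only the Python standard library; the
function names and the entry-count limit are hypothetical, and it is
meant to be run locally on the server against a directory tree of
your choosing.

```python
import os


def large_directories(root, max_entries):
    """Yield (path, entry_count) for directories holding more than max_entries names."""
    for dirpath, dirnames, filenames in os.walk(root):
        count = len(dirnames) + len(filenames)
        if count > max_entries:
            yield dirpath, count


def symlink_hops(path, limit=32):
    """Count how many symbolic links are followed before reaching a non-link."""
    hops = 0
    while os.path.islink(path) and hops < limit:
        target = os.readlink(path)
        # A relative target is resolved against the link's own directory.
        path = os.path.join(os.path.dirname(path), target)
        hops += 1
    return hops
```

For example, `list(large_directories("/export/home", 1000))` would
list directories with more than a thousand names (the path and limit
are placeholders), and a `symlink_hops` result above 1 marks a link
that points to another link.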
Some disk problems can be solved by reallocating disk
drive load.
The administrator must balance the load, usually by
giving a heavily
used file system its own (hopefully faster) disk. With
a separate
disk, requests can be serviced in parallel. Using utilities
such as
iostat, administrators can spot disks that are overloaded.
Moving file systems around can be a bit tricky. Make
sure you think
about space requirements and disk speed before actually
making the
move. If all else fails, sometimes it just comes down
to spending
the money on faster and larger drives.
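Once you have per-disk utilization figures in hand, picking out the
candidates for rebalancing is simple. This sketch assumes you have
already copied busy percentages from your disk statistics into a
dictionary; the function name, the disk names, the 75 percent
threshold, and the readings are all made up for illustration.

```python
def overloaded_disks(busy_percent, threshold=75.0):
    """Return names of disks busier than `threshold` percent, busiest first."""
    hot = [(pct, name) for name, pct in busy_percent.items() if pct > threshold]
    return [name for pct, name in sorted(hot, reverse=True)]


# Hypothetical readings for three disks.
readings = {"sd0": 92.0, "sd1": 15.0, "sd2": 80.0}
print(overloaded_disks(readings))
```

File systems on the disks at the front of the list are the first ones
to consider moving to a lightly used disk such as sd1 here.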
Configuring the kernel can be a challenge. Only attempt
to alter the
kernel configuration after exhausting the other methods
of NFS performance
tuning. Some NFS requests need information about the
inode of a file,
instead of the blocks of data that actually compose
the file.
A problem can exist with the inode table that serves
as a cache for
recently opened files. To control the inode table, set
the MAXUSERS
parameter for Berkeley, or the INODE/NINODE parameter
for System V
(see Figure 3). On your NFS server, the inode table's
default value
is probably not set to its optimal level. As a rule,
add 1.0 for each
diskless client connected to the server in question,
and 0.5 for each
NFS client. Also, consider increasing the size of the
directory name
lookup cache. A larger cache should eliminate the need
to read the
disk when reading directories, but again, this is a
memory tradeoff
with user applications.
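The client rule of thumb above is easy to work out for a given
server. The sketch below only does the arithmetic; it assumes the
result is an amount to add to whatever value your system currently
uses for the relevant parameter, which is an interpretation and not
something the rule spells out. The function name and client counts
are hypothetical.

```python
def inode_table_adjustment(diskless_clients, nfs_clients):
    """Rule of thumb from the text: 1.0 per diskless client, 0.5 per NFS client."""
    return 1.0 * diskless_clients + 0.5 * nfs_clients


# Hypothetical server with 4 diskless clients and 10 other NFS clients.
print(inode_table_adjustment(4, 10))
```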
Conclusion
This article has given you some ideas about what causes NFS
network performance problems and how to solve them. At the very
least, you should
now know where to look. An NFS network is complex. Solving
one problem
does not mean there are not others. Problems can exist
at the hardware,
operating system, and NFS levels. Remember your users
when analyzing
your network. They pay the ultimate price for a network
that is not
functioning optimally. Monitoring NFS network performance
is an ongoing
task, even after the network is analyzed, checked, re-checked,
and
tuned.
Your NFS and UNIX documentation contains a detailed
description of
how to configure your particular system. Consult your
documentation
when considering changes to NFS or operating system
configuration
files. If you are responsible for large networks, make
sure a good
deal of planning goes along with monitoring and configuring
your NFS
network. Many mistakes can be made during installation
of the network.
By allocating only a few more hours of planning, you
can avoid many
problems and corrections.
About the Author
David Linthicum is currently working with Mobil Oil
in Fairfax,
Virginia, as a Senior Software Engineer. The author
of over a dozen
articles appearing in several technical publications,
he just completed
a book for Microtrend entitled: Motif Programmer's Library,
due out
in the beginning of 1993. He is also the co-author of
a book entitled:
Introduction to Programming for Que, which was released
in December
of 1992. Dave teaches System Analysis and Design and
Database Design
at Northern Virginia Community College in Sterling,
Virginia, and
has been teaching there since 1987. He can be reached
at (703) 818-9164
or at 72740.2016@compuserve.com.