Making the Most of NFS

David S. Linthicum

Introduction

Network performance problems are frustrating for users and network managers alike -- users, because of sluggish response times, and managers, because the source of the problems may be very difficult to identify.

This article presents basic steps for making the most of your NFS system. It first explains how to monitor and understand system performance, then addresses tuning and configuration issues. The material presented here is generic; you'll need to consult the manual for your particular system.

Monitoring the Network Request Load

To develop a clear understanding of the demands on your network, monitor the network's request load. NFS requests are generated when the network performs activities such as loading a file into memory from the server, or making multiple I/O requests on the server drive, which may be part of normal database operations. NFS traffic is inherently bursty: requests arrive in random bursts, and the requests within a burst are usually dissimilar.

The network request load varies widely during the day. For example, you may find the load is high in the morning as users read their news, and low at lunchtime while most users are out of the office. To get a complete picture, monitor the network over a typical day's operation.

Measuring the Network

The first step in analyzing performance is to benchmark the network, that is, to define the network's current level of performance. The benchmarking technique you select should closely reflect your NFS call rates as well as your RPC (Remote Procedure Call) distribution. A benchmarking procedure that takes its measurement under a steady network load may not accurately reflect the real-life situation of your network. Your benchmark should represent your daily network requirements -- your actual client workloads -- as closely as possible.

When using a benchmark, be careful of cache effects. The cache on the clients may distort the results, since repeated reads and writes to the same files are generally satisfied by the cache, sharply reducing the number of RPC requests that reach the network. If possible, isolate clients when loading the network; this lets you narrow the benchmark down to specific areas of the network.

Network performance is measured by monitoring the average response time for the typical user on the network: divide the time the NFS RPCs took to complete by the number of calls made. An RPC allows remote communications by defining a standard message format used by high-level protocols; it also provides the mechanism used to define a remote node's operations. You can use the nfsstat program, a third-party network benchmark program, or a program of your own.
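
Here is a minimal sketch of the measurement using nfsstat (described below), assuming a Bourne shell and an nfsstat whose client RPC report carries the cumulative calls counter in the first column of its third output line -- column positions vary by vendor, so adjust the awk field for your system:

#!/bin/sh
# rpcsample.sh -- approximate the NFS RPC rate over one sample interval.
INTERVAL=60
before=`nfsstat -rc | awk 'NR == 3 { print $1 }'`   # cumulative "calls"
sleep $INTERVAL
after=`nfsstat -rc | awk 'NR == 3 { print $1 }'`
calls=`expr $after - $before`
echo "$calls RPC calls in $INTERVAL seconds"

Under a controlled, single-threaded test load, dividing the interval by the number of calls approximates the average time per call.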

nfsstat tracks the number of RPC calls made over a period of time. The program comes with most NFS implementations (at least on the UNIX side) and displays statistical information pertaining to the network and the RPC interfaces. It is important to note that the RPC interface relies directly on the kernel. nfsstat uses the following syntax:

nfsstat [-csnr]
-c	Displays the client information.
-s	Displays the server information.
-n	Displays the NFS information.
-r	Displays the RPC information.
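
For example (the flags can be combined, but the flag combinations and exact counter names vary from vendor to vendor, so check your manual page):

nfsstat -rc    # client-side RPC counters: calls, badcalls, retrans, timeout
nfsstat -rs    # the same counters, as seen by the server
nfsstat -nc    # per-operation NFS counts (read, write, lookup) on the client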

When nfsstat is used to monitor average response time, you should notice (at least at first) that as the network load increases, the average response time naturally goes up. In fact, the average response time initially tracks the number of NFS requests made by users quite closely. After a while, the response time line should smooth out. Generally, take a sampling over several hours of typically heavy operations to get a good reading. The network should be able to handle the peaks and valleys without a prolonged increase in response time, although when the network load suddenly increases, response time may climb until the network has a chance to recover. The server may require additional time to recover from large bursts, which can leave the response time line high; at such times, users may encounter nasty messages such as RPC timeouts.
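
One low-tech way to take such a sampling is to snapshot the counters at a fixed interval over the busy part of the day. A minimal sketch, assuming a Bourne shell (the log file name and the interval are arbitrary):

#!/bin/sh
# nfslog.sh -- log NFS client statistics every 10 minutes.
LOG=/var/tmp/nfsstat.log
while true
do
    date >> $LOG
    nfsstat -c >> $LOG        # client NFS and RPC counters
    sleep 600
done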

The netstat program provides information on such things as UDP socket overflows and IP packets (see Figure 1). While nfsstat assists in direct NFS diagnostics, netstat is helpful when tuning the server or spotting low-level problems.
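
On many BSD-derived systems, netstat -s prints per-protocol statistics, and the UDP section is the place to look for socket overflows. For example:

netstat -s | grep -i overflow    # any non-zero, growing count is suspect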

Places to Check

The performance measurements give the administrator a picture of how the network is currently performing, but it is still difficult to trace problems to particular network components, or "bottlenecks." There is no easy way to spot network bottlenecks, but the following list of potential trouble spots should at least tell you where to look.

Network Interface (cards) -- The network interface may not be sending or receiving packets due to a failure. Fortunately, when network interface cards go, they usually go quickly. Some do linger on, however, and should be considered suspect if network performance is suffering.

Bandwidth -- The network may be congested, which slows down transmissions from the clients to the server. You may spot this problem by an over-abundance of RPC timeout errors. NFS networks that depend on corporate or company-wide networks for transport often show bandwidth problems. In many cases, the only solution is to take NFS off the main network and place it on a network of its own.

Repeaters, Bridges, Routers, and Gateways -- These components connect your network to other remote networks or computers, but they can also cause trouble. Problems with these components can usually be spotted with common network hardware diagnostic equipment.

Server -- A few problems may be traced back to the server. First, the server may be so bombarded with packets that it can't handle all of them. Second, the CPU on the server may be overloaded. The server is responsible for scheduling an nfsd daemon to perform each network operation; if the server is under-powered and cannot handle the number of daemons required, network performance is adversely affected.

In addition to CPU cycles, the amount of installed server memory may be a problem. If memory is low and the server must handle a large NFS load, the operating system may begin to page, drastically reducing server performance. Another possible server bottleneck is disk bandwidth -- the ability to get information off the disk and into client memory. This problem is compounded by the fact that NFS write operations are synchronous, circumventing caching operations and disk controller ordering.
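
A quick way to check for paging is to watch vmstat while the server is under load (the column names vary slightly between systems):

vmstat 5          # sample memory and paging activity every 5 seconds
# Sustained non-zero values in the page-out (po) column mean the
# server is paging and is a candidate for a memory upgrade.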

An NFS network is multidimensional: there are several areas to examine to increase performance or to find a performance problem. A poorly performing network may be caused by a loose network connection, a server that desperately needs a memory upgrade, a router/bridge/gateway problem, and so on. Be absolutely sure that a hardware component is the problem before actually replacing it. If you have spare working hardware around, try swapping out the suspect component and checking network performance again. If there is little or no improvement, look elsewhere.

Tuning NFS

If all of the network components check out okay, then it is time to look at the server configuration as the culprit. If performance monitoring has determined that your NFS performance is unacceptable, and the hardware components all seem to be operating properly, the NFS server itself is usually substandard: unable to handle new or normally scheduled requests.

First, look at the cpu. NFS does not usually constrain the cpu: the nfsd daemon reads and decodes an RPC request, and not much else is required from the cpu by NFS. The problem occurs when the cpu handles other operations besides NFS network processing. UNIX is a funny beast. It will do all that is asked of it, with no complaints, but everything asked of it places a strain on its components. NFS servers generally also act as mail servers, print servers, and terminal servers, which may load the cpu with high-priority or interrupt-level processes. The NFS daemons run at kernel process priority, and therefore at a higher priority than user processes such as applications. If other kernel-level processes are present, the latency to schedule the NFS daemons grows because the daemons do not get the cpu as soon as they need it. This scheduling latency means decreased NFS request performance -- the nfsd daemons are not as responsive to requests as they should be.

It is also important to note that, since the NFS daemons are kernel processes, a busy NFS server can make user process performance (such as database processing) unacceptable. To free up your cpu for NFS processing, move all extraneous activities (such as terminal processing, gateways, and bridges) to other cpus on the network, if possible. Administrators find NFS servers convenient nodes on which to place these services, but too many kernel-level processes do not an effective NFS server make. If your NFS server is performing too many other duties, you may find NFS performance unacceptable. Use sar or vmstat to monitor cpu utilization. sar also monitors buffer and inode activity, but it does not replace netstat or nfsstat (see Figure 2).
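
For example, to sample cpu utilization every five seconds for five minutes:

sar -u 5 60       # System V: %usr, %sys, %wio, %idle
vmstat 5          # BSD: the us, sy, and id columns show the same picture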

The most significant aspect of NFS server tuning is selecting the number of server daemons to run. This is a classic trade-off: too many, and cpu performance suffers; too few, and network performance suffers. The goal is to find a happy medium. As a rule, run two nfsd daemons for each concurrent disk operation; for systems that run two disks on the same controller and can schedule two disk operations at the same time, four nfsd daemons are recommended. At this point, check UDP socket overflows using netstat (see Figure 1). If you see any UDP socket overflow, add daemons and check again. Stop adding daemons when socket overflow is nominal.
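
On many systems, the daemon count is set where nfsd is started at boot time. A sketch, assuming a SunOS-style /etc/rc.local (the startup file and its location vary by vendor):

# in /etc/rc.local, or your system's equivalent startup file:
nfsd 8 &          # start 8 NFS server daemons; adjust, reboot, re-test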

Next, check network performance. If NFS performance still suffers, begin adding daemons again, and stop when the server's load average increases without a corresponding network performance increase. Make sure you test under a normal user load; benchmarking programs are fine if you have no other choice, but a real-life user load is best. This, again, is a trial-and-error procedure. Even once you find the optimal number of NFS daemons, you may have to alter this number later as the number of users and the application load increase. It always does.

NFS takes advantage of the server's buffer cache subsystem. As a rule of thumb, you may want to increase the number of pages allocated for cache memory from the default. Ten to twenty percent of total available memory is the most likely range, but it depends on the server load. Note that when you allocate memory for the cache, you reduce the memory available for user processes; if your server also provides application services, your users won't be happy.

Sometimes the problem with NFS performance is the server's ability to process NFS disk requests. In this case, consider the standard disk tuning factors: adequate kernel table sizes, the balance of disk requests across all disks on the network, and the performance of the disk as a device. As a rule, use the fastest disk available on the server, and make sure that the disk is optimally tuned. NFS read and write operations usually can't take full advantage of caching or disk controller optimization, such as one finds in SCSI and bus-mastering technology.

File system fragmentation can also cause performance problems. Fragmentation occurs when large files, such as database files, are not contiguously allocated on disk, so the read/write head must move back and forth over a larger area to gather the information held in a file. UNIX systems normally do not come with defragmentation software; the only remedy is to dump the entire file system to tape, rebuild it, and reload.
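
A sketch of the dump-and-reload cycle, assuming BSD-style dump and restore, a /home file system on /dev/rsd0g, and a tape drive at /dev/rmt0 -- all of these names are illustrative, and you should verify the dump before running newfs:

umount /home
dump 0f /dev/rmt0 /dev/rsd0g       # level-0 dump of the file system to tape
newfs /dev/rsd0g                   # rebuild an empty file system
mount /dev/sd0g /home
cd /home; restore rf /dev/rmt0     # reload; files come back contiguous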

A big directory may also cause a problem with NFS performance, since directories are searched linearly during lookup operations. The time required to find a name is proportional to the size of the directory and the position of the name within it; the more names a directory contains, the longer it takes to search. Also, beware of symbolic links that point to other symbolic links: UNIX resolves each one as another lookup, which takes more time than a single symlink pointing directly to an inode.
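
One quick way to spot oversized directories is to look for directory files that have themselves grown large. A sketch, assuming your exported file systems live under /export (both the path and the 8-Kbyte threshold are arbitrary):

find /export -type d -size +16 -print   # directories larger than 8 Kbytes
# find's -size counts 512-byte blocks; a directory file this large
# typically holds several hundred entries.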

Some disk problems can be solved by reallocating the disk drive load. The administrator must balance the load, usually by giving a heavily used file system its own (hopefully faster) disk. With a separate disk, requests can be serviced in parallel. Using utilities such as iostat, administrators can spot disks that are overloaded. Moving file systems around can be a bit tricky: think about space requirements and disk speed before actually making the move. If all else fails, sometimes it just comes down to spending the money on faster and larger drives.
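
For example:

iostat 5          # per-disk transfer rates, sampled every 5 seconds
# A disk that consistently shows far more activity than its peers is a
# candidate for having some of its file systems moved elsewhere.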

Configuring the kernel can be a challenge; only attempt to alter the kernel configuration after exhausting the other methods of NFS performance tuning. Keep in mind that some NFS requests need information about a file's inode rather than the blocks of data that actually compose the file.

A problem can exist with the inode table, which serves as a cache for recently opened files. To control the inode table, set the MAXUSERS parameter on Berkeley-derived systems, or the INODE/NINODE parameter on System V (see Figure 3). On your NFS server, the inode table's default value is probably not set to its optimal level. As a rule, add 1.0 for each diskless client connected to the server in question, and 0.5 for each NFS client. Also consider increasing the size of the directory name lookup cache: a larger cache should eliminate the need to read the disk when reading directories, but again, this is a memory tradeoff with user applications.
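
A sketch of the relevant configuration entry, following the rule of thumb above for a hypothetical server with 20 diskless clients and 40 NFS clients (parameter names and file locations vary widely, so consult your kernel configuration documentation):

# BSD-style kernel config file; the inode table scales with maxusers
maxusers 56       # e.g., a base of 16 + (20 diskless * 1.0) + (40 NFS * 0.5)
# On System V, set NINODE in the kernel parameter file (often under
# /etc/conf) using the same arithmetic.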

Conclusion

This article has given you some ideas about what causes NFS network performance problems and how to solve them. At the very least, you should now know where to look. An NFS network is complex; solving one problem does not mean there are not others, and problems can exist at the hardware, operating system, and NFS levels. Remember your users when analyzing your network -- they pay the ultimate price for a network that is not functioning optimally. Monitoring NFS network performance is an ongoing task, even after the network is analyzed, checked, re-checked, and tuned.

Your NFS and UNIX documentation contains a detailed description of how to configure your particular system; consult it when considering changes to NFS or operating system configuration files. If you are responsible for a large network, make sure a good deal of planning goes along with monitoring and configuring your NFS network. Many mistakes can be made during installation of the network, and by allocating just a few more hours to planning, you can avoid many problems and corrections later.

References

Linthicum, David S. "UNIX Facilities for Database Tuning." Database Programming and Design. January 1992.

Linthicum, David S. "Bridging the UNIX Gap." Database Programming and Design. December 1991.

Loukides, Mike. System Performance Tuning. Sebastopol, CA: O'Reilly & Associates, 1990.

Stern, Hal. Managing NFS and NIS. Sebastopol, CA: O'Reilly & Associates, 1991.

About the Author

David Linthicum is currently working with Mobil Oil in Fairfax, Virginia, as a Senior Software Engineer. The author of over a dozen articles in several technical publications, he has just completed a book for Microtrend entitled Motif Programmer's Library, due out in the beginning of 1993. He is also co-author of Introduction to Programming, published by Que in December of 1992. Dave has taught System Analysis and Design and Database Design at Northern Virginia Community College in Sterling, Virginia, since 1987. He can be reached at (703) 818-9164 or at 72740.2016@compuserve.com.