ABCs of Performance Tuning
In the systems world today, the cure-all for most performance-related issues is easy - throw more hardware at it. It's cheaper, quicker, and requires far less expertise. Yet, even with hardware prices dropping, this is not always the most prudent approach. This article will take you through some of the fundamental concepts of performance tuning as well as provide some real-life scenarios that every systems administrator must confront.
Performance Tuning Defined
Although the term performance tuning may, at times, be associated with other duties, it should not take the place of proper systems analysis and right-sizing. Performance tuning, more precisely, means looking at a system during normal working hours, collecting data, and analyzing that data (which is the primary focus of this article). Once the data has been collected and organized and you have had time to review it, you must then verify that the correct resources, in the correct amounts, exist on the system. More specifically, do you have antiquated or inadequate hardware? For example, do you have one 85-MHz processor where two 300-MHz processors are necessary, or six 23-GB disk drives where 34 4-GB disk drives are required? Understanding how hardware components and software applications are related is vital in a mission-critical environment.
Given this, the art of performance tuning requires the ability to locate bottlenecks; yet, it often leads to investigating other variables above and beyond the call of duty. In addition to the hardware-specific analysis ability mentioned above, a certain level of software (application and database) knowledge will enhance your capacity to analyze a system. The better you understand how your system is configured and how the applications and users intend to use it, the more likely you are to get it right. In this article, I will discuss a few items related to performance tuning. Then, I will examine the process of how to monitor and tune a system.
Commercial Analysis Tools
To buy or not to buy? This is a purely dollars and cents question. If uptime has a direct impact on profit and you (or the company you work for) were perceptive enough to budget it in (either into your own cost or that of your customers), and you have a worker with expertise on a systems administration package, then you should consider this option worthwhile. Otherwise, focus your efforts on developing your own set of programs for this purpose. I have seen too many companies that bought this or that package (with bells and whistles and graphs and histograms) only to have it end up in the back of some documentation closet.
Where To Tune First
Have you ever been in a situation where a system's performance deteriorated dramatically over a brief period? On occasion, I have seen people make hasty decisions under these circumstances. They decide to buy more hardware and are surprised when they find themselves x number of dollars further into their budget but somehow still in the same precarious position. Do not accept that one possible solution will get you out of a particular mess without hard evidence and long, careful evaluation. Remember the saying - a successful project is eighty percent planning and twenty percent execution.
When tuning, remember that there are numerous areas to consider. In general, the area where an administrator can expect to get the most return (without any system HW or SW changes) is application tuning; the second is optimizing the database; the third is server and/or network hardware; and the fourth is modifying operating system parameters. The immediate aim here is to focus on overall system efficiency and determine how the subsystems are performing.
What Makes Up A System?
Let's review the basics of systems architecture, with an eye toward the elements that we can tune. A client/server environment is made up of a user community, software, and hardware. The operating system (OS) is there to bridge the gap. The OS is broken into two major components: the kernel and the shell. The shell provides the user community with an interface to access the software and hardware resources. The kernel, in turn, services these user-level requests in addition to the internal programs of its own subsystems. Four basic subsystems make up the kernel: Process Management, Memory Management, I/O Management, and File Management. These are separate entities that work together to service a user and a user's programs. Each plays a significant role in distributing mail, extracting data and passing it on to a Web browser, compiling a programmer's piece of code, or crunching numbers for a point of sale (POS) system. When analyzing a system, the areas that can affect system performance are disk I/O, CPU, memory, and network I/O.
I place disk I/O at the top of the list for a good reason - it is the number one culprit when a system has gone south. iostat and sar (-d) are the utilities to use when looking at disk I/O. There is a set of guidelines to follow when monitoring I/O performance. The first thing to consider is how busy a disk is - anything over 15% busy warrants concern, and anything over 30% requires fixing. Temporary spikes are acceptable, but if these numbers are consistently high, then you must look for solutions. Second, keep a close watch on service time (svc_t). This is the time (in milliseconds) taken to service an I/O request to a particular drive, including time spent waiting in the queue behind other requests. If this number stays above 40, then you need to be concerned and take corrective action. Listing 1 shows some actual numbers. At first glance, this system appears okay. The figures under the percent busy column are acceptable; however, some load balancing could be done to improve performance.
In this listing, sd1, sd2, and sd3 are external SCSI disks on controller 1. sd0 is the internal disk. Note that sd2 and sd3 have little to no activity. These statistics were run on a small UltraSPARC workstation functioning as a mail and DNS server. Relocating the more active filesystems onto these two drives would improve the overall I/O load. By doing this, you would also gain because disk I/O traffic will be distributed more efficiently across controllers. Do not underestimate the importance of load balancing when considering disk I/O performance. Additional things to consider relevant to this area are disk architecture, file access patterns, location of data on drives, and striping when using volume management software.
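The busy and service-time guidelines above can be written down as a small check. The following Python sketch is illustrative only: the function name classify_disk and its return labels are my own invention, and the thresholds (15% and 30% busy, 40 ms service time) are simply the rules of thumb quoted in this article, meant to be applied to sustained figures rather than momentary spikes.

```python
def classify_disk(pct_busy, svc_t_ms):
    """Rate one disk using the article's rules of thumb.

    pct_busy  -- sustained percent-busy figure (not a momentary spike)
    svc_t_ms  -- sustained service time in milliseconds
    """
    # Over 30% busy, or service time above 40 ms, calls for corrective action.
    if pct_busy > 30 or svc_t_ms > 40:
        return "fix"
    # Over 15% busy warrants concern.
    if pct_busy > 15:
        return "watch"
    return "ok"
```

A disk that rates "watch" while its neighbors on another controller sit idle is a natural candidate for the kind of load balancing described above.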
The OS is a resource manager for the system; it is analogous to a policeman monitoring overall system activity. The CPU is the most significant resource the OS manages in that it provides the necessary processing power. It is responsible for servicing kernel-level requests just as the kernel is responsible for servicing user-level requests. All processing depends upon available processor cycles. To measure the performance of the CPU(s), use vmstat, sar (-uq), and mpstat. The primary item to focus on is the runnable process queue length.
Let's look at vmstat's output (pulled from a log file that captures vmstat 20 4) shown in Listing 2.
One thing you need to know is how many CPUs are in the system. mpstat is the easiest way to determine whether you have a multiprocessor box. The rule of thumb for determining acceptable CPU performance is as follows: take the number of runnable processes (the r column) and divide by n (the total number of CPUs); that is, x = r / n. These statistics were taken from a large database server with six processors. The average runnable queue length from the figures shown is 21. Twenty-one divided by 6 equals 3.5 processes per CPU. Up to 5 processes per CPU is considered the maximum to be allotted during busy periods. And, as always, if this number is constantly at or near that upper limit, then corrective alternatives should be considered. On the other hand, if you have a system where there are never any queued processes and the CPU idle column is predominantly in the 90s, the CPU power is underutilized. Perhaps you have another system that could benefit from some of these CPUs.
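The x = r / n rule is simple enough to script. Here is a minimal Python sketch; the function name runq_per_cpu is hypothetical, and the sample r values are made up so that they average 21, matching the six-processor example above (they are not taken from Listing 2).

```python
def runq_per_cpu(r_samples, ncpus):
    """Average runnable-queue length divided by CPU count (x = r / n)."""
    if ncpus < 1:
        raise ValueError("ncpus must be at least 1")
    avg_r = sum(r_samples) / len(r_samples)
    return avg_r / ncpus


# Made-up r-column samples averaging 21, on a six-CPU system.
samples = [18, 22, 24, 20]
load = runq_per_cpu(samples, 6)
print(f"{load:.1f} runnable processes per CPU")  # prints "3.5 runnable processes per CPU"
```

Averaging over several samples, rather than reading a single line of vmstat output, keeps a temporary spike from triggering a false alarm.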
An additional area to watch is the breakdown of CPU utilization. Ideally, it should be 85% user (us) and 15% system (sy). User CPU utilization refers to time the CPU spent processing user instructions. System CPU utilization signifies CPU time attributed to the kernel while servicing a process. For further details, I recommend O'Reilly's "swordfish book" (System Performance Tuning by Mike Loukides, O'Reilly and Associates).
Every program requires physical memory in order to run. Physical memory is broken up into pages. For example, a workstation with 32 MB of memory has 4,096 8-KB pages (32 MB divided by 8 KB). On Solaris, the command pagesize will display the size of a page of memory on the system, as it can vary from platform to platform. I don't know too many people who think of memory in these terms. However, considering the way memory is reserved, displaced, scanned, and reclaimed, it helps in acquiring a system's perspective of memory management. If you run a ps -l, you will see the SZ column, which lists the number of pages for your current shell, as shown in Listing 3.
Although this is beyond the scope of this article, I feel inclined to mention that if you add up all the values under the SZ column from ps -el, you will get a number larger than the total number of physical pages on the system. This is due to virtual memory and the way the kernel manages the virtual address space of a process. It is able to offload certain pages of inactive processes to a backing device (swap).
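To make the page arithmetic concrete, here is a short Python sketch. The helper names page_count and pages_to_kb are hypothetical; the 8-KB page size is only the example used earlier, so check your own platform (pagesize on Solaris, or mmap.PAGESIZE from Python) before relying on it.

```python
import mmap

KB = 1024
MB = 1024 * KB


def page_count(mem_bytes, page_bytes):
    """Number of whole pages that fit in mem_bytes of physical memory."""
    return mem_bytes // page_bytes


def pages_to_kb(pages, page_bytes):
    """Convert a page count (such as an SZ value from ps -l) to kilobytes."""
    return pages * page_bytes // KB


print(page_count(32 * MB, 8 * KB))  # prints 4096
print(mmap.PAGESIZE)                # page size on the machine running this
```

Converting SZ values to kilobytes this way makes it easier to compare a process's footprint against the free and swap figures that vmstat reports in KB.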
Every operating system has a strategy for managing its memory resources. There is a tunable kernel parameter that dictates at what point the system will start trying to reclaim pages of memory. Once this point is reached, systems begin a process called paging. The Solaris OS will attempt to keep a minimum amount of memory free (the free list). However, once the available memory drops below a set mark (lotsfree), the page daemon is started. This daemon monitors processes, trying to reclaim inactive pages to the free list. There is a similar low-water mark (desfree) that dictates when paging is abandoned and swapping begins. Swapping takes entire processes out of main memory and copies them off to disk (swap). Many systems are designed to handle paging fairly efficiently; however, when excessive swapping takes place, system response can drop through the floor.
An important thing to look for when monitoring memory utilization is the rate at which pages are being scanned. Let's look at the vmstat output shown in Listing 4.
Each block of this output is half an hour apart. Once the page daemon starts, it is able to immediately page out (po) a number of pages. The scan rate (sr) column is the key to pinpointing memory-related issues. If sr stays at 0, then you probably have more RAM than the system needs. If it drifts between 0 and 200, you're in good shape. Once the page daemon starts scanning more than 200 pages/second, you need to keep a close watch because anything over 300 is grounds for increasing memory. The w column above shows the number of swapped processes. Although this swapping may look excessive, it can actually be healthy if the processes were swapped due to inactivity. Under the memory section, free indicates the amount (in KB) available on the free list, and swap displays the amount of swap (in KB) currently available. Acceptable swap figures will vary from system to system, but as a general rule available swap should not consistently fall below 10 to 20 MB.
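The scan-rate guidelines can be captured in a few lines. This Python sketch is an illustration under my own assumptions: the function name classify_scan_rate and its labels are hypothetical, and it judges the average of several sr samples (pages/second) against the 200 and 300 marks given above.

```python
def classify_scan_rate(sr_samples):
    """Interpret vmstat sr samples (pages/second) using the article's marks."""
    # sr pinned at 0 across every sample suggests memory to spare.
    if max(sr_samples) == 0:
        return "idle"
    avg = sum(sr_samples) / len(sr_samples)
    if avg <= 200:
        return "ok"
    if avg <= 300:
        return "watch"
    # Sustained scanning above 300 pages/second is grounds for more memory.
    return "add-memory"
```

Feeding it the sr values from a day's worth of logged vmstat runs gives a quick first read before you dig into the raw numbers.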
The network is often thought of as "noise" in the performance-measuring spectrum. That is because it can be a difficult thing to get your arms around without extensive training, experience, and utilities. Each OS has various utilities for tracking packets (AIX's iptrace, Solaris's snoop); additionally, there are numerous software packages that do network analysis. (The November, 1997 Sys Admin contains an excellent article on network monitoring by Jonathan Feldman.) For purposes of this article, I will just touch on netstat and point out the importance of monitoring collisions on your network. To determine the percentage of collisions, multiply the number of collisions by 100 and divide that by the number of outgoing packets (Opkts). The aim is to keep the collision rate below five percent. For example, if your netstat output shows 38421 collisions against 1209455 outgoing packets, then 38421 * 100 / 1209455 = 3.18% collisions (rounded to two places).
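The collision calculation is easy to automate once you have the two counters in hand. A minimal Python sketch follows; the function name collision_pct is hypothetical, and the inputs are the Collis and Opkts counts from the example above.

```python
def collision_pct(collisions, opkts):
    """Collisions as a percentage of outgoing packets (Opkts)."""
    if opkts <= 0:
        raise ValueError("opkts must be positive")
    return collisions * 100 / opkts


pct = collision_pct(38421, 1209455)
print(f"{pct:.2f}% collisions")  # prints "3.18% collisions"
```

Because netstat counters are cumulative since boot, a logged series is more useful if you compute the percentage from the difference between two successive readings.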
Develop A Strategy
Now that you have gained a basic understanding of which commands to use and how to interpret their output, it's important to never find yourself glued to the console tearing through these figures - at least, not if you're like me and try to avoid crisis mode at all costs. Like any systems administrator, I side with a proactive approach rather than a reactive one. Let's look at the steps necessary to begin taking a proactive approach.
First, prepare a filesystem (or space within an existing one); 50-100MB will suffice. Then, create a directory structure and begin logging. It's critical to establish a baseline, and the sooner you begin collecting statistics, the sooner you'll acquire that invaluable baseline. Inevitably, when the need arises to look at performance, you will hear, "why is it doing this now? It never did this before." Oh, how these words can haunt you if you don't have hard evidence to map trends.
There are choices when it comes to logging system activity. You can turn on system activity data collection (sa) and use sar to report against these numbers. I have always been more comfortable with the various stat commands. Listing 5 shows scripts I have placed in my cron table. The frequency at which these are run depends upon how critical the box is. I have some systems for which I run them every half hour, 24 hours a day, and others for which they run every 2 hours just during business hours. It's a good idea to distribute the jobs throughout the hour, so you don't hit the system with too many processes at once. Even on systems where the scripts are run every half hour, the overhead is less than five percent.
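For readers who prefer something other than shell, the logging idea can be sketched in Python. This is not the cron script from Listing 5; it is a minimal illustration under my own assumptions. The function name log_command and the directory layout (one subdirectory per tool, one file per day) are hypothetical conventions.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def log_command(cmd, log_root, now=None):
    """Run cmd (a list of arguments) and append its output to a dated log.

    The layout <log_root>/<tool>/<YYYYMMDD>.log is a made-up convention.
    """
    now = now or datetime.now()
    tool = Path(cmd[0]).name  # use only the base name, even if given a full path
    log_dir = Path(log_root) / tool
    log_dir.mkdir(parents=True, exist_ok=True)
    log_file = log_dir / f"{now:%Y%m%d}.log"
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    with log_file.open("a") as fh:
        # A timestamped separator makes each run easy to find later.
        fh.write(f"=== {now:%Y-%m-%d %H:%M:%S} {' '.join(cmd)}\n")
        fh.write(result.stdout)
    return log_file
```

A cron job could call log_command(["vmstat", "20", "4"], log_root) with whatever logging directory you prepared, and a second job a few minutes later could do the same for iostat, which spreads the load across the hour.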
There are other issues to address when developing performance monitoring strategies. To define the workload (fixed or random) for a system, you must discuss and document all of the jobs with the appropriate people (or departments) involved. This is a necessary chore if you hope to stay on top of a busy system. It will, however, also provide a job control log that can track jobs and processes, particularly those that may need to be scheduled off-hours. Consider taking snapshots (ps -elf) of all processes during peak hours of the day to pinpoint the jobs that are consuming the most resources.
Clarify what is considered acceptable performance with the same group of people that provided information on the workload. Keep an open line of communication with these folks either through daily interaction or weekly status meetings. Someone is destined to call and ask, "what's wrong with the system today? It's taking forever to pull up screens." The more information you have regarding what's running on the system, the easier it will be for you to address these issues. It may turn out to be something simple. Perhaps upper management requested some extensive ad hoc report or the applications group has a high-priority job. However, it may also turn out to be a cue for you to review the statistics you've been collecting on that box. Whatever the situation, you can rest assured that you'll be able to track the problem because you have your logs to consult.
There are also various approaches you can take when deciding what to do with the system activity logs. Listing 6 is a Perl script that parses through the disk I/O logs and reports any activity beyond the high water marks given. I have migrated to this from an outdated awk script. Whether you implement these types of reporting scripts as cron jobs to flag heavy activity or as daily mail messages, the options are plentiful.
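As a rough illustration of this kind of report, here is a Python sketch that scans logged disk statistics for rows above a pair of high-water marks. It is not the Perl script from Listing 6. It assumes iostat -x style output, in which a header line beginning with "device" names the columns (including svc_t and %b); the function name flag_busy_disks and the default marks of 30% busy and 40 ms are my own choices, taken from the guidelines earlier in this article.

```python
def flag_busy_disks(lines, max_busy=30.0, max_svc_t=40.0):
    """Return (device, %b, svc_t) tuples that exceed either high-water mark.

    Assumes iostat -x style rows: a header line starting with "device"
    that names the columns, followed by one row per disk.
    """
    columns = None
    flagged = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "device":
            columns = fields  # remember the column order from the header
            continue
        if columns is None or len(fields) != len(columns):
            continue  # banner lines or anything that doesn't match the header
        row = dict(zip(columns, fields))
        try:
            busy = float(row["%b"])
            svc_t = float(row["svc_t"])
        except (KeyError, ValueError):
            continue
        if busy > max_busy or svc_t > max_svc_t:
            flagged.append((row["device"], busy, svc_t))
    return flagged
```

Looking columns up by header name rather than by position keeps the parser from silently misreading a log if the column order differs between OS releases.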
Let's backtrack to the earlier scenario where a user called concerning system response time. After reviewing the system, suppose you find that for the past two weeks the CPU run queue has steadily increased. It's important from a job control perspective to determine what additional load caused this. Now that you've identified the bottleneck - the system is CPU-bound - you must put together options to remove it. The rule of making only one change at a time to a system certainly holds true with tuning. Carefully plan and track any alterations to the system. So, the loop goes: find the bottleneck, remove it, and ask whether performance is satisfactory. If the answer is yes, break the loop; otherwise, find the next bottleneck, and so on. If you add CPUs to this system, it may cure the excessive runnable-process queue, but it could also cause disk I/O issues to surface. Be prepared for these types of results. If you make some configuration change to address one group of users' problems, be aware that this change may affect other things on the system.
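The find-remove-recheck loop can be sketched in code. The Python below is purely schematic: tuning_loop and its four callable parameters are hypothetical stand-ins for work an administrator does by hand (reading logs, judging response time, making one planned change), and the max_rounds cap is my own safeguard.

```python
def tuning_loop(measure, satisfactory, find_bottleneck, apply_change, max_rounds=10):
    """Repeat: measure, stop if satisfactory, else fix one bottleneck.

    Returns the list of bottlenecks addressed, in order, as a change log.
    """
    changes = []
    for _ in range(max_rounds):
        stats = measure()
        if satisfactory(stats):
            break
        bottleneck = find_bottleneck(stats)
        apply_change(bottleneck)  # exactly one change per pass
        changes.append(bottleneck)
    return changes
```

In a toy run where the CPU is the worst offender and disk I/O is second, the loop returns ["cpu", "disk"], mirroring the way curing a CPU-bound system can bring a disk I/O problem to the surface.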
About the Author
Joe Beck has been doing Systems Administration for six years. He currently works as an independent contractor for Sun Microsystems, on a project for the State of New Jersey (Dept. of Labor, tax redesign) working with Sun Enterprise Cluster and Oracle Parallel Server. He can be reached at firstname.lastname@example.org.