Performance Management in an RDBMS Environment

Richard S. Smith

Performance management traditionally centers on the tasks seasoned system administrators (SAs) perform to keep a system, or a collection of systems, operating at its best. In a database environment, however, good performance management becomes a team effort, often across organizational lines, between SAs and database administrators (DBAs). Similarly, the tool set used to achieve good performance management combines UNIX tools with those provided by the database management system (DBMS). This article explores that combination within the context of IBM RS/6000 hardware running AIX and the PROGRESS RDBMS. However, the experience gained and lessons learned are applicable in any large client-server environment.

Case Study

The particular site involved in this example uses an RS/6000 R30 as the server machine. It's basically the top end of the RS/6000 line with 8 PowerPC 601 processors (soon to be 604s) and 1.5 GB of RAM. Two more RS/6000s are used as front-end machines, which handle all the login sessions from approximately 850 users working on terminals and PCs, while keeping the server free to do its job. The two client machines are linked to the server using a high-speed FDDI ring, and user logins are split between the two machines using a load-balancing nameserver. The database itself consumes more than 12 GB of disk space. The software environment includes AIX 4.1 and Version 7 of the PROGRESS RDBMS.

Although PROGRESS performance tuning could be the subject of another entire article, I do want to explain how I measured the load placed on the server by the PROGRESS database in terms of CPU, memory, and disk I/O performance.

In simple terms, the RDBMS software is just the broker between users requesting and changing data and the actual physical database that resides on disk. Therefore, database performance is considered "optimal" when as much data as possible is being read from and written to the disk. In other words, my job as a performance manager is to maximize throughput. This may sound simple but can get complicated in practice, when problems at the hardware, operating system, database, and application layers conspire to degrade throughput and create a user perception of sluggishness.

From the perspective of CPU performance, the thing to watch for is that the CPU is never 100% utilized. If it is, the CPU itself has become the bottleneck in the process of getting data on and off the disk. If memory becomes scarce and the system starts to page excessively, that must also be addressed. The best situation is to have excess capacity in terms of both CPU and memory; then the raw speed of the disk and I/O hardware dictates the maximum throughput of the system.

Tools of the Trade

For the most part, the vmstat and iostat utilities give sufficient information to determine when memory or processor upgrades are needed. AIX also provides xmperf, which gives a graphical view of CPU usage, I/O activity, and so forth.
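As a minimal illustration of how this system-level data can be captured for later correlation (this is not one of the numbered listings, and the field positions are an assumption based on AIX's vmstat layout, where us, sy, id, and wa are the final columns), an Expect script can spawn vmstat and log the idle percentage at each interval:

    #!/usr/local/bin/expect --
    # Sketch only: spawn vmstat and append the CPU idle column to a log
    # for later correlation with promon data.  Assumes AIX vmstat output,
    # where us, sy, id, and wa are the last four columns.
    set interval 60
    set timeout -1               ;# samples arrive slowly; never time out
    set log [open "cpu.log" "a"]
    spawn vmstat $interval
    expect {
        -re " *(\[0-9]+ +)+\[0-9]+\r\n" {
            set fields $expect_out(0,string)
            # idle (id) is the next-to-last field, just before wait (wa)
            set idle [lindex $fields [expr [llength $fields] - 2]]
            puts $log "[exec date] idle=$idle"
            flush $log
            exp_continue
        }
    }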

The DBMS, PROGRESS in this case, can also provide useful performance statistics, although not always in a format that is easily correlated with the other system data. The PROGRESS monitoring utility, promon, is a menu-driven, interactive program: it was not designed for unattended use and does not generate tabular data the way other UNIX utilities do. (See Listing 1.)

Expect as a Performance Tool

Don Libes' Expect package can be used to work around deficiencies in the reporting capability of the DBMS. By writing Expect scripts that spawn promon processes and capture the data, you can extract the needed information and easily correlate it with CPU data from vmstat and I/O data from iostat. (See Listing 2 and Listing 3.)

As you can see from Listing 2, I keep track of the user count, as well as the count of physical and logical writes to the database. The user count is important because it indicates the demand for system resources. The logical writes are a measure of the load placed on the PROGRESS DBMS by the application, and the physical write count is a measure of the load that the DBMS is, in turn, passing on to the UNIX I/O subsystem.
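Because the listings are necessarily specific to my environment, a stripped-down sketch of the technique may help. In the skeleton below, the database path, menu choice, and screen-text patterns are invented placeholders, since promon's menus and labels vary by version; Listings 2 and 3 show the real scripts:

    #!/usr/local/bin/expect --
    # Skeleton of the promon-scraping technique (see Listings 2 and 3 for
    # the real scripts).  The database path, menu selection, and screen
    # text below are placeholders, not promon's actual prompts.
    set timeout 30
    spawn promon /db/production.db     ;# placeholder database path
    expect "Enter your selection:"     ;# assumed main-menu prompt
    send "5\r"                         ;# assumed activity-screen choice
    expect -re "Number of users: *(\[0-9]+)"
    set users $expect_out(1,string)
    expect -re "Physical writes: *(\[0-9]+)"
    set phys $expect_out(1,string)
    expect -re "Logical writes: *(\[0-9]+)"
    set logical $expect_out(1,string)
    send "q\r"                         ;# assumed quit keystroke
    puts "[exec date] users=$users physical=$phys logical=$logical"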

It is difficult to give simple descriptions of how I use these metrics, but as an example, I expect to see a linear relationship between physical database writes and system-level writes as revealed by iostat. Furthermore, if the ratio of logical to physical writes is low (close to 1:1 instead of 2:1 or higher), that suggests a problem either with the index structure (as defined in the database schema) or with application code performing reads and writes that don't "hit the index." This is a common programming mistake, but one that is easily corrected.
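A few lines of Tcl appended to the monitoring sketch above can flag this condition automatically; the 2.0 threshold is simply the rough 2:1 rule of thumb from the preceding paragraph and should be tuned for your site:

    # Flag a suspiciously low ratio of logical to physical writes.
    # The 2.0 threshold reflects the rough 2:1 rule of thumb above.
    if {$phys > 0} {
        set ratio [expr double($logical) / $phys]
        if {$ratio < 2.0} {
            puts "WARNING: logical:physical write ratio is only $ratio"
            puts "         check index definitions and application code"
        }
    }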

Another use of these statistics concerns the political side. Often, the user count revealed by my Expect scripts is higher than the user count that managers have told the systems administration staff to allow for in budgeting, capacity planning, and routine tuning of the system.

I dump this data into a graphics program that gives me a graph of the user load over a day. I can then make an overhead transparency and take it into planning meetings. Thus, if accused of negligently allowing the system to bog down, I can defend myself and demonstrate that the system needs to be enhanced to handle the increased load placed upon it by the users.

Expect, besides being used as a front end for the promon utility, also came in handy when I was asked to perform load tests in preparation for system software upgrades and other major changes to the application environment. I used Expect to write scripts simulating users logging into the system, accessing commonly used parts of the application, and entering simple transactions. I was then able to take baseline measurements of the system load, make a single change, and then measure again to see the results.
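Those scripts were, again, application-specific, but the shape of a simulated user is simple. In the sketch below, the host name, account, prompts, and keystrokes are all invented for illustration; substitute your own login sequence and transactions:

    #!/usr/local/bin/expect --
    # Sketch of a simulated-user load test.  Host, account, prompts, and
    # application keystrokes are invented; substitute the real sequence.
    set timeout 60
    spawn telnet frontend1             ;# placeholder front-end host
    expect "login:"    { send "loadtest\r" }
    expect "Password:" { send "secret\r" }
    expect "MAIN MENU"                 ;# assumed application banner
    # enter a batch of simple transactions
    for {set i 0} {$i < 100} {incr i} {
        send "1\r"                     ;# assumed menu choice
        expect "Order number:"         ;# assumed field prompt
        send "TEST$i\r"
        expect "MAIN MENU"
    }
    send "x\r"                         ;# assumed logout keystroke
    expect eof

Running a few dozen copies of such a script in parallel approximates a user population well enough for before-and-after comparisons.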

This was a real lifesaver when planning for major upgrades to the PROGRESS software or evaluating products from IBM or other third-party vendors that changed the system environment. This information enabled me to recommend some changes and prevent some others that would have been disastrous.

Expect also gave me the ability to measure the response time of the application. This provided invaluable insight whenever users complained about the system being slow. Being able to measure objectively how long users were actually waiting helped me determine whether a slowdown represented a true database or system tuning problem. (See Listing 4.)
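Listing 4 is the script I actually use; the core idea is just to timestamp both sides of a send/expect exchange. A minimal sketch, again with placeholder prompts and keystrokes:

    #!/usr/local/bin/expect --
    # Sketch of the response-time measurement idea behind Listing 4.
    # Prompts and keystrokes are placeholders; clock seconds gives only
    # one-second granularity, which is adequate for user-visible delays.
    set timeout 120
    spawn telnet frontend1             ;# placeholder front-end host
    expect "login:"    { send "loadtest\r" }
    expect "Password:" { send "secret\r" }
    expect "MAIN MENU"                 ;# assumed application banner
    set start [clock seconds]
    send "2\r"                         ;# assumed inquiry-screen choice
    expect "Inquiry complete"          ;# assumed completion message
    set elapsed [expr [clock seconds] - $start]
    puts "[exec date] response=$elapsed seconds"
    send "x\r"                         ;# assumed logout keystroke
    expect eof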

Conclusions

In summary, the skills required to be an effective SA/DBA are a superset of the "traditional" SA skill set. Obviously, it is necessary to be expert on the DBMS and the demands it places upon UNIX and the system hardware. It is also necessary to be knowledgeable about the application and user sides of the DBMS, so that you can anticipate and resolve issues involving the programmers and the user base effectively and diplomatically.

About the author

Richard Smith is a system and database administrator for [800]Sys.Admin, a division of San Francisco-based Capital Technologies, Inc. He founded the Usenet newsgroup comp.databases.progress and maintains a PROGRESS FAQ at http://www.captech.com/~rss/progress-faq.html. He can be reached via email at rss@captech.com.