Performance Management in an RDBMS Environment
Richard S. Smith
Performance management traditionally centers on
tasks performed by
seasoned system administrators (SAs) to keep one system,
or a collection of
systems, operating at their best.
however,
good performance management becomes a team effort, often
across
organizational lines, between SAs and database administrators
(DBAs).
Similarly, the tool set used to achieve good performance
management is a
combination of UNIX tools and those provided by the
database management
system (DBMS). This article explores this combination
within the context
of IBM RS/6000 hardware running AIX and the PROGRESS
RDBMS. However, the
experience gained and lessons learned are applicable
in any large
client-server environment.
Case Study
The particular site involved in this example uses an
RS/6000 R30 as the
server machine. It's basically the top end of the RS/6000
line with 8
PowerPC 601 processors (soon to be 604s) and 1.5 GB
of RAM. Two more
RS/6000s are used as front-end machines, which handle
all the login
sessions from approximately 850 users working on terminals
and PCs,
while keeping the server free to do its job. The two
client machines are
linked to the server using a high-speed FDDI ring, and
user logins are
split between the two machines using a load-balancing
nameserver. The
database itself consumes more than 12 GB of disk space.
The software
environment includes AIX 4.1 and Version 7 of the PROGRESS
RDBMS.
Although PROGRESS performance tuning could be the subject
of another
entire article, I do want to explain how I measured
the load placed on
the server by the PROGRESS database in terms of CPU,
memory, and disk
I/O performance.
In simple terms, the RDBMS software is just the broker
between users
requesting and changing data and the actual physical
database that
resides on disk. Therefore, database performance is
considered "optimal"
when as much data as possible is being read from and
written to the
disk. In other words, my job as a performance manager
is to maximize
throughput. This may sound simple but, in practice,
can get complicated
when problems at the hardware, operating system, database,
and
application layers conspire to degrade throughput and
create a user
perception of sluggishness.
From the perspective of CPU performance, the thing to
watch for is that
the CPU is never 100% utilized. If it is, the CPU itself
has become the
bottleneck in the process of getting data on and off
the disk. If memory
becomes unavailable and the system starts to page excessively,
this must
also be addressed. The best situation is to have excess
capacity in
terms of both CPU and memory. The raw speed of the disk
and I/O hardware
will dictate the maximum throughput of the system.
Tools of the Trade
For the most part, the vmstat and iostat utilities give
sufficient
information to determine when memory or processor upgrades
are needed.
AIX also provides xmperf, which gives a graphical view
of CPU usage, I/O
activity, and so forth.
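For round-the-clock correlation, it helps to log these statistics
with timestamps rather than watch them interactively. The fragment
below is a minimal sketch of that idea in Expect/Tcl, not a listing
from this article; the interval and vmstat arguments are illustrative:

    #!/usr/bin/expect --
    # Hypothetical sketch: stamp and log the last line of a short
    # vmstat run once a minute, so CPU and paging figures can be
    # lined up against promon samples later.
    set interval 60
    while {1} {
        # Two samples, five seconds apart; the second line reflects
        # current activity rather than averages since boot.
        set out [exec vmstat 5 2]
        set line [lindex [split $out "\n"] end]
        puts "[clock format [clock seconds] -format %H:%M:%S] $line"
        sleep $interval
    }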
The DBMS, PROGRESS in this case, can also provide useful
performance
statistics, although not always in a format that can
be easily
correlated with the other system data. The PROGRESS
monitoring utility
(promon) was designed for interactive use and does
not generate
tabular data the way other UNIX utility programs do. (See
Listing 1.)
Expect as a Performance Tool
Don Libes' Expect package can be used to work around
deficiencies in the
reporting capability of the DBMS. By writing Expect
scripts that spawn
promon processes and capture the data, you can extract
the needed
information and easily correlate it with CPU data from
vmstat and I/O
data from iostat. (See Listing 2 and Listing
3.)
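To give a flavor of how such a script is put together, here is a
minimal sketch of driving promon with Expect. It is not one of the
listings; the database path, menu selection, prompt strings, and
field labels are all placeholders that would need to be checked
against your own promon screens:

    #!/usr/bin/expect --
    # Hypothetical sketch only -- not one of the article's listings.
    # Database path, menu choice, prompts, and field labels are
    # placeholders; verify them against your promon release.
    set timeout 30
    spawn promon /db/proddb
    expect "Enter your selection:"
    send "5\r"                          ;# assumed: an activity screen
    expect -re {([0-9]+) +users}        ;# assumed field label
    set users $expect_out(1,string)
    expect -re {[Pp]hysical writes +([0-9]+)}
    set physical $expect_out(1,string)
    expect -re {[Ll]ogical writes +([0-9]+)}
    set logical $expect_out(1,string)
    puts "users=$users physical=$physical logical=$logical"
    send "x\r"                          ;# assumed: the exit keystroke
    expect eof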
As you can see from Listing 2, I keep track of the user
count, as well
as the count of physical and logical writes to the database.
The user
count is important, because it indicates the demand
for system
resources. The logical writes are a measure of the load
placed on the
PROGRESS DBMS by the application, and the physical write
count is a
measure of the load that the DBMS is in turn passing
on to the UNIX I/O
subsystem.
It is difficult to give simple descriptions of how I
use these metrics,
but as an example, I expect to see a linear relationship
between
physical database writes and system-level writes as
revealed by iostat.
Furthermore, if the ratio of logical to physical writes
is low (close to
1:1 instead of 2:1 or higher), that suggests either
a problem with the
index structure (as defined in the database schema)
or with application
code performing reads and writes that don't "hit
the index." This is a
common mistake made by programmers, but can easily be
corrected.
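A sanity check along these lines can even be automated. The
following Tcl fragment is a hypothetical illustration, with an
arbitrary 2:1 threshold, of flagging samples whose write ratio
looks suspicious:

    # Hypothetical sketch: warn when logical writes barely exceed
    # physical writes, hinting at poorly indexed access paths.
    proc check_ratio {logical physical} {
        if {$physical == 0} { return }
        set ratio [expr {double($logical) / $physical}]
        if {$ratio < 2.0} {
            puts [format "warning: logical:physical is %.1f:1" $ratio]
        }
    }
    check_ratio 1500 1400   ;# warns -- roughly 1.1:1
    check_ratio 5000 1200   ;# silent -- roughly 4.2:1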
Another use of these statistics concerns
the political side.
Often, the user count revealed by my Expect scripts
is higher than the
user count that the managers have told the systems administration
staff
to allow for in budgeting, capacity planning, and routine
tuning of the
system.
I dump this data into a graphics program that gives
me a graph of the
user load over a day. I can then make an overhead transparency
and take
that into planning meetings. Thus, I can defend myself
if accused of
being negligent in allowing the system to bog down and
can demonstrate
that the system needs to be enhanced to handle the increased
load placed
upon it by the users.
Expect, besides being used as a front-end for the promon
utility, also
came in handy when I was asked to perform load tests
in preparation for
system software upgrades and other major changes to
the application
environment. I used Expect to write scripts simulating
users logging
into the system, accessing commonly used parts of the
application, and
entering simple transactions. I was then able to take
baseline
measurements of the system load, make a single change,
and then measure
again to see the results.
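The shape of such a simulated user is straightforward. The sketch
below is hypothetical; the host name, test account, and menu
keystrokes stand in for whatever your login sequence and
application actually present. Launching several dozen copies in
the background approximates a production load:

    #!/usr/bin/expect --
    # Hypothetical sketch of one simulated user.  Host, account,
    # prompts, and menu keystrokes are placeholders.
    set timeout 60
    spawn rlogin frontend1 -l loadtest
    expect "assword:"
    send "secret\r"                ;# a throwaway test account
    expect "$ "
    send "menu\r"                  ;# assumed: starts the application
    expect "Main Menu"
    send "3\r"                     ;# visit a commonly used screen
    expect "Order Entry"
    send "\r"                      ;# enter a trivial transaction
    expect "Main Menu"
    send "q\r"
    expect eof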
This was a real lifesaver when planning for major upgrades
to the
PROGRESS software or evaluating products from IBM or
other third-party
vendors that changed the system environment. This information
enabled me
to recommend some changes and prevent some others that
would have been
disastrous.
Expect also gave me the ability to measure the response
time of the
application. This provided invaluable insight
when users
complained about the system being slow. The ability
to objectively
measure how long users were actually waiting helped
me to determine when
a system slowdown represented a true database or system
tuning problem.
(See Listing 4.)
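The timing technique itself is simple: record the clock before
sending a keystroke and again when the expected screen text
arrives. This hypothetical sketch, with placeholder prompts,
shows the idea:

    #!/usr/bin/expect --
    # Hypothetical sketch of timing one screen; prompts and
    # keystrokes are placeholders.  Resolution is one second,
    # enough to spot the multi-second waits users complain about.
    set timeout 120
    spawn rlogin frontend1 -l loadtest
    expect "assword:"
    send "secret\r"
    expect "Main Menu"
    set t0 [clock seconds]
    send "3\r"                     ;# request a commonly used screen
    expect "Order Entry"           ;# text marking a finished paint
    set t1 [clock seconds]
    puts "response time: [expr {$t1 - $t0}] seconds"
    send "q\r"
    expect eof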
Conclusions
In summary, the skills required to be an effective SA/DBA
are a superset
of the "traditional" SA skill set. Obviously,
it is necessary to be an
expert on the DBMS in terms of the demands it places upon
UNIX and the
system hardware. It is also necessary to be knowledgeable
about the
application and user sides of the DBMS, so that you
are able to
anticipate and resolve issues that involve the programmers
and the user
base effectively and diplomatically.
About the author
Richard Smith is a system and database administrator
for Sys.Admin,
a division of San Francisco-based Capital Technologies,
Inc. He also
founded the Usenet newsgroup comp.databases.progress
and maintains a
PROGRESS FAQ at: http://www.captech.com/~rss/progress-faq.html
and can
be reached via email at: rss@captech.com.