Multi-Platform
Performance Monitoring on the Cheap
Dale Southard
The monitoring system presented in this article grew out of a
simple need for a portable system with which I could remotely monitor
performance metrics on UNIX hosts. When I began looking for a solution,
I considered many of the available commercial and free products. The standard
xload is easy to understand, but only provides a single metric
(load average) and doesn't retain information across invocations.
The top command is better, but again provides only a snapshot,
not a history trail. Sun's perfmeter and rstatd
provide more metrics and the ability to save trails, but are only
available under a few architectures. SGI's Performance Co-Pilot
can monitor and save an incredible number of metrics, but at the
time was only available under IRIX (it has since been ported to
Linux as well). Finally, SNMP looked like a future contender, but
still suffered from a lack of affordable monitoring packages and
security issues on some platforms.
What I really wanted was the ability to collect and save a group
of performance metrics and then reduce them to a form that is easy
to understand. Ideally, the tools should be portable to a wide range
of UNIX flavors. Upon further consideration, I found my needs were
simple enough to be met with syslog and some common UNIX
utilities.
The original inspiration for the design came from one of syslogd's
built-in features, the mark timestamp. Most modern syslog
daemons provide a mark function that places a timestamp in the logfile
at regular intervals. This is often used to help fix the time of
catastrophic system events (such as sudden power loss) that would
otherwise provide no log evidence. What limits the usefulness of
the standard syslogd mark function is that it provides only
a mark indicating that the machine is powered on and running syslogd.
In most cases, users and sys admins are interested in monitoring
more than the state of machine power and correct syslogd
function.
Collecting the Metrics
For this article, I assume that we have a network of six machines
at foo.com. One is the central "monhost" where we will
be doing the monitoring. The other five are the client hosts that
we will monitor. Each client host is running a different operating
system (Solaris, IRIX, IRIX64, Linux, and MacOS X). For this example,
we will monitor the following metrics on each client host:
- Load average of the machine
- Amount of free memory in MB
- Amount of free swap space in MB
The first step in the process was determining what metrics to
monitor and how to obtain them. Getting the load average is trivial.
Most variants of SVR4 UNIX (Solaris 2.x, IRIX, etc.) and BSD UNIX
(SunOS 4.x, BSD, MacOS X, etc.) include the "uptime" command
that includes the system load averages for the past 1, 5, and 15
minutes. Since I was interested in the five-minute load average,
a simple awk command is enough to select the appropriate field:
uptime | awk '{gsub(",",""); print $(NF-1)}'
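The extraction can be sanity-checked against a canned uptime line before deploying it; the sample below mimics typical SVR4/BSD output (exact spacing varies between flavors):

```shell
# Sample uptime output (spacing and wording vary slightly per UNIX flavor)
sample=' 11:02pm  up 45 days,  3:41,  2 users,  load average: 0.12, 0.34, 0.56'

# Same awk as the collection pipeline: strip the commas, then take the
# next-to-last field, which is the 5-minute load average
echo "$sample" | awk '{gsub(",",""); print $(NF-1)}'
# prints 0.34
```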
Free memory and swap are more difficult since the commands used to
monitor them differ wildly between UNIX flavors. Because the "freemem"
and free swap values are related and share the same units, I chose
to extract them to a single line of output -- first memory, then
swap, separated by white space.
Solaris and IRIX are standard SVR4 variants and provide the sar
command for monitoring a variety of performance metrics. For memory
and swap, the -r flag will narrow sar's output
to the metrics we are interested in. By default, sar reports
freemem in pages and free swap in disk blocks. sar's
notion of a basic block is 512 bytes on all platforms presented
here, so dividing by 2048 will convert blocks to MB.
Conversion of freemem pages to MB is OS-dependent. For Solaris,
"pagesize" is 8192 bytes, so dividing by 128 gives MB.
IRIX comes in two "widths" -- the smaller desktop
machines use 32-bit kernels with a pagesize of 4096 bytes; the larger
servers use 64-bit kernels with a pagesize of 16384 bytes --
so we will need to divide by 256 or 64, respectively. We will again
use awk to filter for our desired metrics and perform the necessary
conversions.
Solaris free memory and free swap in MB:
sar -r 1 | awk '{m=int($2/128);s=int($3/2048)} END {print m,s}'
IRIX free memory and free swap in MB:
sar -r 1 | awk '{m=int($2/256);s=int($3/2048)} END {print m,s}'
IRIX64 free memory and free swap in MB:
sar -r 1 | awk '{m=int($2/64);s=int($3/2048)} END {print m,s}'
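On systems that provide the POSIX getconf utility (an assumption; some older platforms lack it), the pages-to-MB divisor can be derived from the page size rather than hard-coded per OS:

```shell
# MB = pages * pagesize / 1048576, so the divisor is 1048576/pagesize:
# 8192-byte pages give 128, 4096-byte give 256, and 16384-byte give 64
divisor=$((1048576 / $(getconf PAGESIZE)))
echo "$divisor"
```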
Linux and MacOS both lack sar, and obtaining the memory and
swap information is more difficult. In the case of Linux, the memory
information can be accessed through the /proc/meminfo pseudo-file.
Since that file presents the output as a series of lines, we will
need to use awk's pattern-matching abilities to select the correct
line for each of our metrics:
awk '/MemFree/ {m=int($2/1024)} \
/SwapFree/{f=int($2/1024)}\
END {print m,f}' /proc/meminfo
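The awk program can be tried against a canned /proc/meminfo fragment (values in kB, as the kernel reports them; the numbers below are made up):

```shell
# Canned /proc/meminfo fragment standing in for the real pseudo-file
printf 'MemTotal:   515548 kB\nMemFree:    102400 kB\nSwapTotal: 1048576 kB\nSwapFree:   524288 kB\n' |
awk '/MemFree/ {m=int($2/1024)} /SwapFree/ {f=int($2/1024)} END {print m,f}'
# prints 100 512
```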
For MacOS X, things are more difficult. OS X uses a dynamic paging
system that can create swap files as needed (assuming that the disk
has enough free space to accommodate such files). This makes our notion
of "free swap space" somewhat bogus since the OS will simply
create additional swap files as space is required. Rather than present
unrealistic numbers for free swap, we'll just punt and report
only the free memory information under Mac OS X. This can be obtained
from the output of vm_stat:
vm_stat | awk '/free:/ {gsub("\\.","");print int($3/256)}'
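A canned vm_stat sample shows the conversion at work (4096-byte pages, so dividing by 256 gives MB; the page count below is made up). Only the "Pages free:" line matches the /free:/ pattern:

```shell
# Canned vm_stat output standing in for the real command; the gsub
# strips the trailing period so $3 becomes a plain page count
printf 'Mach Virtual Memory Statistics: (page size of 4096 bytes)\nPages free:    25600.\nPages active:  10000.\n' |
awk '/free:/ {gsub("\\.","");print int($3/256)}'
# prints 100
```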
Using syslogd for Transporting Information
Now that we have determined how to extract the metrics we want,
the next step is to provide a way to remotely monitor them. Luckily,
most UNIX variants provide this capability in the form of syslog.
The syslog daemon handles messages according to their
priority, which in syslog's case is a "facility.level"
pair. "Facility" refers to the class of information contained
in the message -- common facilities are "kern" for
kernel messages, "daemon" for system daemon messages,
and "auth" for security messages. Because our performance
metrics are a local addition, we will use one of the local facilities.
(For these examples, I chose the "local3" facility.)
The level part of the priority refers to how serious the message
is, ranging from emerg (meaning the system is unusable)
down to debug (normal debug-level messages). Since our
service will be informational, we will log at the info level.
The first step is to configure our central loghost to save the
messages in a file separate from the usual syslog information
(in this example, /var/log/perflog). This can be done by
adding the following line to the syslog.conf file and then
sending a SIGHUP to syslogd (note that many syslogd
implementations require tabs, not spaces, between the selector
and the action):
local3.info	/var/log/perflog
We also need to configure syslog on each client host to send
the performance data to the monitoring host. Again, this is done by
adding a single line to syslog.conf and sending a SIGHUP to
the syslog daemon:
local3.info @monhost.foo.com
Finally, we should test that the above changes are working. Running
the command logger -p local3.info -t TEST hello world on one
of the clients should generate a line like the following in the /var/log/perflog
file on the monhost:
Jan 25 23:26:51 irixhost TEST: hello world
Note that the syslog entry includes both "timestamp"
and "hostname" information, which will be useful later when
we parse the perflog file for the metrics we want.
Gathering the Data
With syslogd configured on the hosts, we can now use the
command pipelines determined in the first step to extract and send
data to the monitoring host. Because we want to monitor the metrics
over time, we will execute the collection command from cron on the
client hosts. As an example, the following crontab entries could
be used to send the load, memory, and swap information from the
irixhost every 15 minutes (linebreaks have been added for clarity):
0,15,30,45 * * * * uptime | awk '{gsub(",",""); print $(NF-1)}' |\
logger -p local3.info -t load
0,15,30,45 * * * * sar -r 1 | awk '{m=int($2/256);s=int($3/2048)}\
END {print m,s}' | logger -p local3.info -t memswp
Similar entries should be made in the other client hosts, adjusting
the memswp line to incorporate the OS-specific metric collection
method previously determined. At this point, we should now be accumulating
performance data in the perflog file we configured on the monhost.
Data Reduction
The final step is to take the gathered data and turn it into something
understandable by non-technical users. We will use the open source
GNUplot program for this task. The first step is to split the data
into separate files for each machine and metric. Each file should
contain XY data for plotting (where the X data will be the time
and date information, and the Y data will be one or more related
metrics). Again, we can do this with awk or sed. For example, we
can extract the load average data for the irixhost client using
a command like the following:
awk '/irixhost load:/ {print $1,$2,$3,$6}' /var/log/perflog >irixhost.load
Or, we can use a simpler sed command to filter out the values we are
not interested in:
sed 's/irixhost load://;t;d' /var/log/perflog >irixhost.load
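With more than a couple of hosts, the splitting can also be done in a single awk pass over the perflog. This is only a sketch: a canned sample stands in for /var/log/perflog, and the tag names match the -t arguments from the crontab entries:

```shell
# Canned perflog sample standing in for /var/log/perflog
cat > perflog.sample <<'EOF'
Jan 25 23:00:00 irixhost load: 0.34
Jan 25 23:00:01 irixhost memswp: 120 800
Jan 25 23:00:02 solarishost load: 1.02
EOF

# $4 is the hostname, $5 the logger tag; write each record to a
# <host>.<metric> datafile, keeping the timestamp for the X axis
awk '$5 == "load:" || $5 == "memswp:" {
    tag = $5; sub(":", "", tag)
    file = $4 "." tag
    line = $1 " " $2 " " $3
    for (i = 6; i <= NF; i++) line = line " " $i
    print line > file
}' perflog.sample

cat irixhost.load
# prints Jan 25 23:00:00 0.34
```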
Once we have created one or more XY datafiles, we can plot them using
the GNUplot program. I find it useful to do the initial plots using
GNUplot's interactive mode. Again using the load average data
from the irixhost client as an example, the following GNUplot settings
will produce a basic plot that is a good starting point for further
customization:
set title "Load Average"
set data style lines
set yrange [0:]
set xdata time
set timefmt "%b %d %H:%M:%S"
set format x "%m/%d %H:%M:%S"
plot "irixhost.load" using 1:4 title "IRIX Workstation"
GNUplot is a very capable program, and entire articles could be devoted
to exploring various options. Here are a few suggestions:
- GNUplot has extensive online help. When in doubt, try the help
command.
- The set xrange command can be used to select a range
of dates to plot.
For example:
set xrange ["Jan 1 00:00:00":"Feb 1 00:00:00"]
- Multiple data sets can be plotted using a single plot command
(e.g., plotting a total of four metrics selected from two different
data files):
plot \
"irixhost.memswp" using 1:4 title "irixhost.foo.com free mem",\
"irixhost.memswp" using 1:5 title "irixhost.foo.com free swap",\
"solarishost.memswp" using 1:4 title "solarishost.foo.com free mem",\
"solarishost.memswp" using 1:5 title "solarishost.foo.com free swap"
- GNUplot supports several output formats including .png and
.eps. Use the set terminal command to select a format,
and the set output command to choose an output file.
- The save and load commands can be used to store the variable
settings for later use once you've found a good layout. This
is especially useful if you tend to look at the same metrics frequently,
because the plot can be updated by simply re-parsing the perflog
file and re-running gnuplot.
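As an illustration of that workflow (the filenames here are just examples): after arranging a plot interactively, save "loadplot.gp" stores the current settings, and a short batch script run from cron can regenerate the graph. Depending on your GNUplot version, the saved file may include its own terminal setting that you will want to edit out:

```gnuplot
set terminal png
set output "irixhost-load.png"
load "loadplot.gp"
```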
Where to Go from Here
It's easy to extend this system to metrics beyond those presented
here. Almost anything that can be run from cron can be timed or
filtered through awk, sed, or Perl to produce a metric that can
be sent to the monitoring host. It's also fairly easy to use
GNUplot from within a cron script to generate near-real-time performance
graphs. Such graphs can even be generated or copied into a directory
that is exported by a Web server to provide wider access to the
performance data. At one site I even extended this concept to send
alpha pages to the sys admin by parsing the perflog file
every few minutes and testing metrics against some established minimum/maximum
values.
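A minimal sketch of that alerting idea, with a made-up host name and limit; a real version would read /var/log/perflog and pipe the alert to a mail or paging command instead of echoing it:

```shell
#!/bin/sh
# Hypothetical threshold check; HOST, LIMIT, and the sample data are
# examples, not values from the article
HOST=irixhost
LIMIT=8.0

# Canned perflog sample standing in for /var/log/perflog
sample='Jan 25 23:00:00 irixhost load: 2.10
Jan 25 23:15:00 irixhost load: 9.50'

# Latest load entry for the host ($4 = hostname, $6 = 5-minute load)
load=$(echo "$sample" | awk -v h="$HOST" '$4 == h && $5 == "load:" {v = $6} END {print v}')

# awk does the floating-point comparison portably
if awk -v l="$load" -v max="$LIMIT" 'BEGIN {exit !(l > max)}'; then
    echo "ALERT: $HOST load is $load (limit $LIMIT)"
fi
```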
Although the system presented here lacks many of the high-end
features found in packages like PCP, I have found it useful on many
occasions over the years. Because this method relies on components
found on almost all UNIX-like operating systems, I often find it
easier to make a couple of syslog.conf and crontab entries
than to install a more complex package just to monitor a single
metric.
Links
http://www.gnuplot.info/
http://www.sgi.com/software/co-pilot/
http://oss.sgi.com/projects/pcp/
Dale Southard is currently a systems administrator with the
Accelerated Strategic Computing Initiative at Lawrence Livermore
National Laboratory in Livermore, California. He can be contacted
at: dsouth@llnl.gov.