Getting the Info -- sar
[Author's note: many of the system terms like bread/bwrite,
were presented and discussed in Sys Admin 1.3 (Sept.
"Getting the Info -- u386mon".]
To properly manage system performance, system managers and
planners must monitor the usage of the various system components
of the operating environment. The UNIX utility sar, the
System Activity Reporter, supplies a low-cost method of collecting
that data: it is bundled with nearly all forms of UNIX. Although
sar is readily available, you might still opt to avoid it.
sar easily qualifies as one of those rather esoteric utilities
for which UNIX is often criticized. sar is difficult to use,
difficult to interpret, and it lacks the interactive interface
which many people want. For those willing to make the effort, sar
will deliver important performance information.
sar has two forms: that of the data collector, and that of
the data viewer. The options and explanations for the collector
are in Figure 1, and for the viewer in Figure
What Will sar Monitor?
sar is capable of monitoring the following components
of the system:
CPU Utilization -- reported as the portion of
time spent in user mode, system mode, idle and waiting
for block I/O.
Buffer Activity -- the number of transfers per
second between system buffers and disk, accesses of
cache hit ratios, and physical device data transfers.
Disk/Tape Devices -- consisting of busy/queue
requests, number of transfers to/from the devices, and
TTY Devices -- input and output character processing.
System Calls -- reported as counts for some
specific system calls and for the aggregate of all calls.
System Swapping -- transfers made for swapping
and process switching.
File System Access -- the number of calls to
routines like iget and namei.
Report on Run Queue -- the number of processes
in the run queue.
File/Inode and Process Tables -- the number
of entries and size of each table.
Message/Semaphores -- the number of primitives
Paging Activities -- counts page faults and
Unused Memory -- the amount of free memory and
free space on the swap device.
Remote File Sharing and Server/Request Queue --
statistics on RFS services.
In collection mode, sar places its output in the file /usr/adm/sa/sa??,
where "??" is replaced with the day of the month.
For example, the
file created for June 26 would be sa26.
How It Works
sar samples a series of kernel counters at given intervals,
in order to provide a sense of what is happening on
the system during
peak hours. The actual data collection is typically
handled by shell
scripts that are scheduled by cron, as in Listing 1.
This example, which is fairly representative of common practice, uses two
shell scripts, namely sa1 and sa2. Both of these commands
are in /usr/lib/sa, while sar is in /usr/bin.
The sa1 shell script causes sar's output to be saved
to the file mentioned earlier, namely /usr/adm/sa/sa??. The
first two entries in the crontab of Listing 1 cause data
to be gathered every twenty minutes during normal working hours, and
hourly otherwise. This cross section provides a "reasonable"
view of the system utilization. You can change the parameters used
when the sa1 command is run, if this level of granularity is
not appropriate to your installation.
The second shell script, sa2, takes the output from sa1
and builds a daily report, which is stored in /usr/adm/sa/sarDD,
where "DD" is the day of the month.
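The naming scheme for the daily files is easy to reproduce when you want to locate a given day's data from your own scripts. The following is a minimal Python sketch and not part of the sar package: the function names are made up for illustration, and it assumes the /usr/adm/sa directory named in this article and a two-digit day of the month.

```python
from datetime import date

# Directory named in this article; other systems may keep the files elsewhere.
SA_DIR = "/usr/adm/sa"


def sa_data_file(day, sa_dir=SA_DIR):
    """Path of the binary data file that sa1 writes for the given day."""
    return "%s/sa%02d" % (sa_dir, day.day)


def sa_report_file(day, sa_dir=SA_DIR):
    """Path of the readable daily report that sa2 builds for the given day."""
    return "%s/sar%02d" % (sa_dir, day.day)


if __name__ == "__main__":
    today = date.today()
    print(sa_data_file(today))
    print(sa_report_file(today))
```

Because only the day of the month appears in the name, each file is overwritten when the same day number comes around again, so copy anything you want to keep for longer.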
sa1 uses the separate command /usr/lib/sa/sadc, which
is the actual data collector for the facility, while sa2 uses
the facilities of sar to build the report. The command
arguments for sadc are also shown in Figure 1. sadc
writes a binary record to standard output, or to the named
file. The structure of this binary record (see Figure 3) is documented
in the manual pages, but is not part of any C header file.
If sadc is executed with no arguments, rather than report a
syntax error, the command will write a special record
to the binary
file, indicating that the system counters have been reset;
this special mode is used at boot time by one of the startup
scripts, usually /etc/rc2.d/S21perf.
sa2 uses the sar command to build the readable report.
The options which are used on the sa2 command line are passed
directly to sar, and are explained in Figure 1.
The remainder of this article illustrates how to interpret the
various reports. The example reports were generated
on a Motorola
8000 UNIX system, which was deliberately loaded to create
Monitoring CPU Utilization
The -u option, which is also the default if no option is given,
will instruct sar to report on the CPU utilization.
shows a sample report.
In each report, sar gives the system and node names,
number, the date, and selected averages. In the balance
of my examples,
I'll omit this information.
The CPU utilization report summarizes four data fields.
%usr is the
percentage of time operating in user mode, %sys is the percentage
in system or kernel mode, %wio is the percentage of
time the CPU is
idle because it was waiting for a process's block I/O to complete,
and %idle represents time when the system wasn't doing anything.
Note that Table 1 indicates a potential CPU bottleneck: there are
no remaining CPU cycles (0 %idle). Similarly, the report
in Table 2
indicates a problem with the disk subsystem.
Table 3 shows an example from a heavily loaded SCO UNIX system. Note
that an %idle of zero indicates that this machine has nothing more
to give the users in the area of raw computing power. A
large %wio (say greater than 10 percent) could indicate
that the disk
subsystem is incapable of meeting demands. This problem
can be lessened
by using a second or alternate disk for some of the
data, or upgrading
the disk subsystem.
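The two rules of thumb for reading the CPU report (no idle time left, and more than about 10 percent of the time spent waiting on I/O) can be written down as a small check. This is only a sketch in Python: assess_cpu is a hypothetical helper, not a sar feature, and the 10 percent limit is the suggestion from the text rather than a hard rule.

```python
def assess_cpu(wio, idle, wio_limit=10.0):
    """Return warnings for one sample of sar -u output.

    wio and idle are the %wio and %idle percentages from the report.
    wio_limit is the rule-of-thumb threshold from the article.
    """
    warnings = []
    if idle <= 0:
        # No spare cycles: the CPU itself is the likely bottleneck.
        warnings.append("CPU: no idle cycles remain")
    if wio > wio_limit:
        # Time spent idle waiting on block I/O points at the disk subsystem.
        warnings.append("disk: high I/O wait")
    return warnings


if __name__ == "__main__":
    # Illustrative values only, not taken from the article's tables.
    for sample in ({"wio": 0, "idle": 0}, {"wio": 25, "idle": 40}):
        print(sample, assess_cpu(**sample))
```

A single sample proves little; the check is more useful when applied to every interval in a day's report to see how often each warning appears.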
Buffers, Swapping, and Memory
The -b option instructs sar to report on the next potential
problem area, system buffer usage. Sample output appears
in Table 4.
Values appearing in a column with a `/s' in the heading are
`hits' per second. The most critical values are %rcache
and %wcache, which are the cache hit rates for disk reads and writes,
respectively. For optimum performance, the hit ratio
for both reads
and writes should be as close to 100 percent as possible.
In general, if %rcache is less than 90 percent, then the size of
the buffer cache
should be increased. If %wcache is less than 70 percent, the
buffer cache should also be made bigger. Remember that dramatic increases
in the size of the buffer cache remove RAM from the
system and may
increase the swapping or paging traffic on your system.
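The hit ratios are derived from the physical and logical transfer rates in the same report: as far as I know, sar computes %rcache as 1 - bread/lread and %wcache as 1 - bwrit/lwrit, each expressed as a percentage. The Python sketch below reproduces that arithmetic and applies the thresholds quoted above. The function names are hypothetical, and returning None when there was no logical activity is my own choice, not something sar documents.

```python
def hit_ratio(physical, logical):
    """Cache hit percentage from physical and logical transfers per second."""
    if logical <= 0:
        return None  # no logical activity, so no meaningful ratio
    return max(0.0, 100.0 * (logical - physical) / logical)


def assess_buffers(bread, lread, bwrit, lwrit, read_floor=90.0, write_floor=70.0):
    """Return warnings when the read or write hit ratio is below its floor."""
    warnings = []
    rcache = hit_ratio(bread, lread)
    wcache = hit_ratio(bwrit, lwrit)
    if rcache is not None and rcache < read_floor:
        warnings.append("%%rcache %.0f below %.0f" % (rcache, read_floor))
    if wcache is not None and wcache < write_floor:
        warnings.append("%%wcache %.0f below %.0f" % (wcache, write_floor))
    return warnings


if __name__ == "__main__":
    # Illustrative rates only: 20 of 100 reads and 50 of 100 writes hit the disk.
    print(assess_buffers(bread=20, lread=100, bwrit=50, lwrit=100))
```

The floors are parameters because they are guidelines; pick values that match the behavior you consider acceptable on your own system.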
To get a full picture of memory performance, you must also examine
paging and swapping statistics. While you can always improve
hit ratios by increasing the number of buffers, beyond
a certain level
further increases will actually have an adverse effect on
performance, because the excess buffers will decrease
the amount of
available memory for processes and data.
The -w option to sar reports on paging and swapping
activity (see Table 5 for an example).
The pswch/s value is the number of process switches
per second, that
is, the number of processes which have been through the
CPU. In the
example above on our not-too-busy system, two process switches took
place in each of the first two five-second intervals, and none
in the last.
(This is an example of a "lightly-loaded"
system.) If the
system doesn't have enough RAM, it will do some paging
to manage its memory deficiency. On systems running
at or near the
maximum potential, the paging and swapping components
may also reach
Many systems report how much RAM is really available
after all of
the kernel requirements are satisfied at boot time.
You can get similar
information anytime with sar's -r option. This option
will report the amount of free RAM and free swap space
at the moment
when the command was executed.
# sar -r 1 1
00:03:05 freemem freeswp
00:03:06 486 25696
The freemem value is the number of currently free
4 KB pages. On my 4 Meg system, sar reports 486 * 4096 = 1,990,656
free bytes. The freeswp value is the amount of disk space
available for swapping, in 512-byte blocks.
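Converting the two columns to bytes is simple arithmetic, sketched here in Python. The parser is hypothetical and handles only a data line laid out like the sample above (time, freemem, freeswp); the 4096-byte page and 512-byte block sizes are the ones given in the text and may differ on other systems.

```python
PAGE_SIZE = 4096   # bytes per memory page, as stated in the text
SWAP_BLOCK = 512   # bytes per swap block, as stated in the text


def parse_sar_r(line):
    """Return (free_memory_bytes, free_swap_bytes) from one sar -r data line."""
    fields = line.split()
    freemem_pages = int(fields[1])
    freeswp_blocks = int(fields[2])
    return freemem_pages * PAGE_SIZE, freeswp_blocks * SWAP_BLOCK


if __name__ == "__main__":
    mem_bytes, swap_bytes = parse_sar_r("00:03:06 486 25696")
    print("free memory: %d bytes" % mem_bytes)
    print("free swap:   %d bytes" % swap_bytes)
```

For the sample line this gives 1,990,656 bytes of free memory and 13,156,352 bytes of free swap.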
On systems where the freeswp value is changing,
which indicates that swap space is being used, but the report produced
by the -w option shows no swapping activity, it may
indicate that vhand is running to collect memory pages.
Because RAM is such a vital resource, it is common to use the
-rwp options to show the free RAM available, paging,
The -d option to sar doesn't work on all systems. For
example, my Motorola 8000 UNIX system does not report anything
when I use the -d option; however, my SCO 3.2v4 UNIX system does.
The information reported here relates to the hard disk
and tape subsystems
only. The names of the devices are very system specific.
The example shown in Table 6 is from an SCO UNIX system
which has a 425 Megabyte IDE hard disk. The wd-0 device refers
to the first hard disk on the Western Digital style controller.
In this case, the controller and disk processed 49.77 transfers of
data per second (r+w/s), moving 212.14 blocks per second (blks/s). This calculates to
212.14 / 49.77 = 4.26 blocks per transfer approximately, or a little over
2 KB per transfer if the blocks are the usual 512 bytes.
This figure may indicate that the disk subsystem may
not be fast enough to cope with large data transfers
on this system.
In this output, the reports were on a per second basis,
so a rate of 212.14 blocks (roughly 106 KB) per second may be quite sufficient.
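Dividing blks/s by r+w/s gives the average number of blocks moved per transfer, which is the arithmetic used here. A short Python sketch makes the units explicit; the function name is hypothetical, and the 512-byte block size is an assumption that you should confirm against your system's sar manual page.

```python
def disk_transfer_stats(rw_per_sec, blks_per_sec, block_size=512):
    """Summarize one sar -d sample.

    Returns None when no transfers took place, otherwise a dict with
    blocks per transfer, bytes per transfer, and bytes per second.
    block_size is assumed to be 512 bytes.
    """
    if rw_per_sec <= 0:
        return None
    blocks_per_transfer = blks_per_sec / rw_per_sec
    return {
        "blocks_per_transfer": blocks_per_transfer,
        "bytes_per_transfer": blocks_per_transfer * block_size,
        "bytes_per_sec": blks_per_sec * block_size,
    }


if __name__ == "__main__":
    stats = disk_transfer_stats(rw_per_sec=49.77, blks_per_sec=212.14)
    print("%.2f blocks per transfer" % stats["blocks_per_transfer"])
    print("%.0f bytes per transfer" % stats["bytes_per_transfer"])
    print("%.0f bytes per second" % stats["bytes_per_sec"])
```

With the figures quoted in the text, this works out to about 4.26 blocks per transfer.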
Additional important fields in this output are:
avwait -- the average time before the request
is passed to the controller
avserv -- the average time before the request
is processed by the controller
These two fields are significant as they represent how
well your disk controller is responding to the requests. The lower
the number, the faster the request was processed and returned
to the application. Both of these values are in milliseconds.
After memory performance, the next most common performance problem
is terminal I/O load. As intelligent serial interface cards become
more common, tty activity impacts the overall system performance
much less than when systems still relied on the main
CPU to process
all character input.
The -y option instructs sar to report on the tty
activity (see Table 7).
Each of the columns reports the number of characters
through the system
per second. On the example system, the values are usually
because a backup and a UUCP call to a neighboring machine were in
progress when Table 7 was generated, the numbers are
rawch and canch are the number of characters per second
processed through the raw and canonical interfaces. outch is
the output character rate in characters per second.
sar can also report on the system call utilization, broken down
by these subgroups: all system calls, read(S), write(S),
fork, and exec. An example report appears in Table 8.
The scall field counts all system calls in the interval.
read(S), fork, and exec report the number of
calls to each respective routine. While you may enjoy
these call counts, you probably won't find it to be
Files, Inodes, and Processes
The process, file, and inode tables are fixed in size,
and are adjusted
by following the kernel configuration instructions for your
version of UNIX. The sar -v option reports on the usage of
these resources. (You can get similar information from
Table 9 is a sample of the sar -v output on my Motorola
system. Since I am the primary (and usually the only)
user on this
machine, we can say that this is a single user system.
For each of four resources (proc-sz, process table; inod-sz,
inode table; file-sz, file table; lock-sz, file locks),
the -v option reports the number currently allocated followed
by the maximum configured. For example, from the sample there
are 26 processes out of a maximum of 128. The ov column reports
the number of overflows during the sampling period.
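Each of these fields is a "used/maximum" pair, so the percentage of a table in use is easy to compute. The Python sketch below is illustrative only: the helper names are hypothetical, and the 50 percent warning level is an arbitrary assumption, not a documented limit.

```python
def table_usage(field):
    """Split a sar -v field such as "26/128" into (used, size, percent_used)."""
    used_text, size_text = field.split("/")
    used, size = int(used_text), int(size_text)  # int() tolerates padding spaces
    return used, size, 100.0 * used / size


def is_crowded(field, limit=50.0):
    """True when more than `limit` percent of the table is in use."""
    return table_usage(field)[2] > limit


if __name__ == "__main__":
    used, size, percent = table_usage("26/128")
    print("%d of %d entries in use (%.1f%%)" % (used, size, percent))
    print("crowded:", is_crowded("26/128"))
```

For the process table in the example, 26 of 128 entries is about 20 percent, which leaves plenty of headroom.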
Table 10 shows the same report for one of my employer's
SCO UNIX systems.
Even though the system in Table 10 was fairly idle,
half of the
process table was already consumed. The same is true
for both the
inode and file tables. Although it is possible for the
on this system to cope with these parameters, this configuration
Running out of any of these resources will result in
work being "lost,"
or in some aspect of the system not working correctly.
Evaluating the Data
Making sense out of the information provided from any performance
tool can be difficult, but here are some things to watch for when
evaluating the data.
When looking at I/O subsystem problems, if the %wio
is consistently higher than 10, then I would suggest
that both swapping
and buffers be examined.
Keep an eye on swapping. Consistent and prolonged swapping
indicates a memory deficiency, and may eventually lead to thrashing,
a condition indicated by almost constant disk activity
and very high
levels of swapping.
An overabundance of buffers will decrease the amount
of RAM available to the user and system processes, thereby increasing
the likelihood of swapping. This problem can be overcome by adding
more RAM to the system, or by decreasing kernel tunables,
The performance of the system can also be affected if
there aren't enough buffers. If the ratios reported
in the buffer
analysis are less than 80 percent for writes, and 90
percent for reads,
I would suggest creating a kernel with more buffers.
SCO UNIX and
XENIX will automatically calculate a buffer setting
based upon the
amount of available RAM. This calculated default can
Record and rectify any kernel error messages, such as
"file table overflow," as this indicates that
your file table
isn't large enough. Be warned, however, that making
the kernel tables
too large also removes needed RAM from the system by allocating it
to the kernel.
In assessing performance, remember that some factors
may be inherent
in the design of the software or the hardware, such
as different data
paths on the bus and the controller, e.g., an 8-bit
ISA card in a
32-bit bus machine. Also keep in mind that typically,
only a small
increase in system performance can be achieved by adjusting kernel
parameters. Depending upon what is adjusted, and to
what levels, this
may make things worse.
While the art (or science) of performance tuning is shrouded
in mist, a little common sense will help you see better:
when adjusting kernel parameters, go small -- too big
may be too
About the Author
Chris Hare is Ottawa Technical Services Manager for
Choreo Systems, Inc.
He has worked in the UNIX environment since 1986 and
in 1988 became one of
the first SCO authorized instructors in Canada. He teaches
system administration, and programming classes. His
current focus is on
networking, Perl, and X. Chris can be reached at firstname.lastname@example.org,
email@example.com, which is his home.