In a development environment, any of several groups
of people --
Development, Integration, Test, and Training groups
-- may use
similar platforms configured differently over the life
of a product.
These groups often find they need an easy way to verify
a system's
configuration or to discern one system from another.
System verification determines whether a machine's hardware
and software
are in a predefined configuration (baseline). If not
at the baseline
configuration, how far from baseline is the configuration?
There are
many reasons to do automated configuration checks:
To monitor and identify system parameters that affect
performance.
The verify script in Listing 1 shows an automated, reliable,
easy
to use and maintain tool for verifying a system configuration.
This
article addresses each of the problems listed above
and contains a
technical description of the tool itself.
Ease of Use
At a site with many different systems, an administrator
is often unable
to check a system configuration as soon as a user requests
the check.
So it is convenient to enable users to differentiate
between system
configurations. With this script, any user can verify
a system configuration
without having to learn new commands, new techniques,
or obscure system
calls. Placing this script in their path gives users
a fast, reliable
method of determining system configuration on their
own.
Consistent System Resources
Changes to a system configuration may not immediately
ripple through
to all machines, and this can confuse some users about
the correct
configuration. For example, suppose Development initially
defines
60 shared memory segments as adequate for a system.
Integration discovers
this is not enough, so they change the kernel to support
120 shared
memory segments. Now two potentially different configurations
exist
for the same system. If Development is not aware of
the change, they
can waste time chasing down an old problem. A user may
suspect the
kernel has not been updated, but few reliable methods
exist to verify
this. The kernel configuration file can be manually
checked, but since
users can find or make kernels on any system and copy
them over, there
is no guarantee that the local configuration file belongs
to that
kernel. Checksumming the kernel may show the two kernels
as different,
but it cannot tell how they differ. Changing the hostname
may deliver
a new checksum, but a new checksum doesn't identify
a new hostname
as the cause. In this case, the best way to differentiate
kernels
is to create a tool that reads the kernel variable directly,
and incorporate
it into this script. Any user can run the script to
easily verify
a system.
Also consider that if a set of machines in configured
the same during
development and integration, your organization will
spend less time
troubleshooting platform interface and dependency problems
during
the integration and delivery phases of a product. This
tool can help
provide that consistency.
Standard Working Environment
If you use a standard user environment, you must have
a way to ensure
that important settings have not been changed. For example,
if you
require certain aliases, masks, or paths, you can use
verify
to easily and consistently verify that users have not
modified their
environment in any detrimental way.
Tuning Tool
Monitoring system parameters with verify is easy. For
every
parameter you suspect is too low or too high, place
a check in the
script. For example, if the kernel has been configured
to support
100 processes, you can compare the current number of
processes to
the maximum value built into the kernel and better estimate
its optimum
value.
Other Methods of Verification
Generally, all verification methods have one aspect
in common: they
compare a current configuration to a baseline configuration.
One verification
method does this by calculating checksums on a set of
files and comparing
those sums to the baseline sums. This method can work
fine, though
slowly, for a small set of binary executables, such
as kernels, but
is unacceptable for databases, configuration files,
and network routing
tables. The contents of these files are imprecise. For
instance, these
files often have spaces and tabs used interchangeably.
You may be
concerned not with the contents of an object but with
its attributes,
such as file permissions and file ownership.
Another verification method has an administrator manually
check each
machine. This method is extremely labor intensive, prone
to error,
and boring. The task becomes unmanageable when a site
has a moderate
number of systems.
The verify script combines the best features of both
verification
methods in one automated script. It checksums the contents
of a small
number of files. It performs content checks on configuration
files.
verify also compares baseline values against dynamic
configuration
items, such as routing info, and compares file attributes.
verify
is easy to maintain and distribute. Used from a central
file server,
it allows one administrator to quickly, reliably, and
automatically
determine the configuration of a set of systems.
Technical Description
The script is invoked by a privileged user from a shell.
It takes
one command line argument, "-v" (verbose),
that writes to
standard output the status of all configuration items,
whether in
baseline or not. Without the "-v" switch,
the script defaults
to output only configuration items that are out of baseline.
The most often used operation in the script is:
-- Get the current state of a configuration
item from the system.
-- Compare the current state to the baseline
state and determine which items are:
-- within the baseline,
-- extraneous to the baseline, and
-- missing from the baseline.
This operation is used for filesystem mounts, partitions,
network
routes, and other items where resources can be added,
deleted, or
changed. Another often used step is:
-- Check existence of a configuration item
This step verifies file existence, file permission,
file checksums,
application program existence, and product licenses.
The script itself contains all the baseline configuration
data. There
are no external configuration data files in this example
because in
practice each system/baseline is unique enough to warrant
a unique
script. And as most sites run more than one platform/system,
it is
much easier to maintain a single file rather than a
command file and
multiple data files for each system. If this is not
the case at your
site, then by all means set up a command file that uses
configuration
items in separate data files.
See Figure 1 for a sample run of the script invoked
with the "-v"
option. Note that baseline configuration items are tagged
with the
prefix "Baseline". Non-baseline items are
tagged "???????>".
Without the "-v" option, only the non-baseline
items are written.
The power of this method is in its flexibility. Any
verification that
can be done on a system can be written into that system's
script and
executed many times faster and much more reliably than
by hand. You
can also use verify in a function that allows unattended
verification
of all machines in a set, with non-baseline items automatically
written
to a file and mailed to interested parties.
It is also possible, though not described here, to automate
insertion
of baseline parameters into this script. Once a baseline
is defined,
another script could extract pertinent data from the
baseline system
and write it into this script. You might write such
a script if the
baseline changes often and the number of configuration
items is moderate
to large.
Porting the Script to Other Operating Systems
The script as presented here was written for DEC Ultrix
(BSD), but
is easy to port to other operating systems. Each configuration
item
within the script has a precise data format and example.
Porting is
a matter of massaging data into that format using common
UNIX tools
like awk(1), sed(1), or cut(1). We successfully
ported the script to an SVID system in only two days
including test
time.
verify, as presented, checks about 100 configuration
items.
It runs in about 30 seconds of real time per machine.
Considering
that performing the same number of checks with the same
accuracy can
take as long as 20-30 minutes per machine, preparing
a verification
script is well worth the effort.
About the Author
Packey Velleca received a BSEE degree from the Florida
Institute
of Technology in 1988. He currently works with a group
of administrators
for a small system of real-time computers with UNIX-based
operator
workstations. His earlier article, Reproducing Boot-Disk
Filesystems,
appeared in the Jan/Feb 1994 issue of Sys Admin.