Article

A Method for Verifying System Integrity

Packey Velleca

Why System Verification Is Needed

In a development environment, any of several groups of people -- Development, Integration, Test, and Training groups -- may use similar platforms configured differently over the life of a product. These groups often find they need an easy way to verify a system's configuration or to discern one system from another.

System verification determines whether a machine's hardware and software are in a predefined configuration (baseline). If not at the baseline configuration, how far from baseline is the configuration? There are many reasons to do automated configuration checks:

To ensure that a group of users has consistent system resources available across a set of systems.

To verify that a user's working environment is a standard one.

To monitor and identify system parameters that affect performance.

The verify script in Listing 1 shows an automated, reliable, easy to use and maintain tool for verifying a system configuration. This article addresses each of the problems listed above and contains a technical description of the tool itself.

Ease of Use

At a site with many different systems, an administrator is often unable to check a system configuration as soon as a user requests the check. So it is convenient to enable users to differentiate between system configurations. With this script, any user can verify a system configuration without having to learn new commands, new techniques, or obscure system calls. Placing this script in their path gives users a fast, reliable method of determining system configuration on their own.

Consistent System Resources

Changes to a system configuration may not immediately ripple through to all machines, and this can confuse some users about the correct configuration. For example, suppose Development initially defines 60 shared memory segments as adequate for a system. Integration discovers this is not enough, so they change the kernel to support 120 shared memory segments. Now two potentially different configurations exist for the same system. If Development is not aware of the change, they can waste time chasing down an old problem. A user may suspect the kernel has not been updated, but few reliable methods exist to verify this. The kernel configuration file can be manually checked, but since users can find or make kernels on any system and copy them over, there is no guarantee that the local configuration file belongs to that kernel. Checksumming the kernel may show the two kernels as different, but it cannot tell how they differ. Changing the hostname may deliver a new checksum, but a new checksum doesn't identify a new hostname as the cause. In this case, the best way to differentiate kernels is to create a tool that reads the kernel variable directly, and incorporate it into this script. Any user can run the script to easily verify a system.

Also consider that if a set of machines in configured the same during development and integration, your organization will spend less time troubleshooting platform interface and dependency problems during the integration and delivery phases of a product. This tool can help provide that consistency.

Standard Working Environment

If you use a standard user environment, you must have a way to ensure that important settings have not been changed. For example, if you require certain aliases, masks, or paths, you can use verify to easily and consistently verify that users have not modified their environment in any detrimental way.

Tuning Tool

Monitoring system parameters with verify is easy. For every parameter you suspect is too low or too high, place a check in the script. For example, if the kernel has been configured to support 100 processes, you can compare the current number of processes to the maximum value built into the kernel and better estimate its optimum value.

Other Methods of Verification

Generally, all verification methods have one aspect in common: they compare a current configuration to a baseline configuration. One verification method does this by calculating checksums on a set of files and comparing those sums to the baseline sums. This method can work fine, though slowly, for a small set of binary executables, such as kernels, but is unacceptable for databases, configuration files, and network routing tables. The contents of these files are imprecise. For instance, these files often have spaces and tabs used interchangeably. You may be concerned not with the contents of an object but with its attributes, such as file permissions and file ownership.

Another verification method has an administrator manually check each machine. This method is extremely labor intensive, prone to error, and boring. The task becomes unmanageable when a site has a moderate number of systems.

The verify script combines the best features of both verification methods in one automated script. It checksums the contents of a small number of files. It performs content checks on configuration files. verify also compares baseline values against dynamic configuration items, such as routing info, and compares file attributes. verify is easy to maintain and distribute. Used from a central file server, it allows one administrator to quickly, reliably, and automatically determine the configuration of a set of systems.

Technical Description

The script is invoked by a privileged user from a shell. It takes one command line argument, "-v" (verbose), that writes to standard output the status of all configuration items, whether in baseline or not. Without the "-v" switch, the script defaults to output only configuration items that are out of baseline.

The most often used operation in the script is:

-- Get the current state of a configuration item from the system.

-- Compare the current state to the baseline state and determine which items are:

-- within the baseline,

-- extraneous to the baseline, and

-- missing from the baseline.

This operation is used for filesystem mounts, partitions, network routes, and other items where resources can be added, deleted, or changed. Another often used step is:

-- Check existence of a configuration item

This step verifies file existence, file permission, file checksums, application program existence, and product licenses.

The script itself contains all the baseline configuration data. There are no external configuration data files in this example because in practice each system/baseline is unique enough to warrant a unique script. And as most sites run more than one platform/system, it is much easier to maintain a single file rather than a command file and multiple data files for each system. If this is not the case at your site, then by all means set up a command file that uses configuration items in separate data files.

See Figure 1 for a sample run of the script invoked with the "-v" option. Note that baseline configuration items are tagged with the prefix "Baseline". Non-baseline items are tagged "???????>". Without the "-v" option, only the non-baseline items are written.

The power of this method is in its flexibility. Any verification that can be done on a system can be written into that system's script and executed many times faster and much more reliably than by hand. You can also use verify in a function that allows unattended verification of all machines in a set, with non-baseline items automatically written to a file and mailed to interested parties.

It is also possible, though not described here, to automate insertion of baseline parameters into this script. Once a baseline is defined, another script could extract pertinent data from the baseline system and write it into this script. You might write such a script if the baseline changes often and the number of configuration items is moderate to large.

Porting the Script to Other Operating Systems

The script as presented here was written for DEC Ultrix (BSD), but is easy to port to other operating systems. Each configuration item within the script has a precise data format and example. Porting is a matter of massaging data into that format using common UNIX tools like awk(1), sed(1), or cut(1). We successfully ported the script to an SVID system in only two days including test time.

verify, as presented, checks about 100 configuration items. It runs in about 30 seconds of real time per machine. Considering that performing the same number of checks with the same accuracy can take as long as 20-30 minutes per machine, preparing a verification script is well worth the effort.

About the Author

Packey Velleca received a BSEE degree from the Florida Institute of Technology in 1988. He currently works with a group of administrators for a small system of real-time computers with UNIX-based operator workstations. His earlier article, Reproducing Boot-Disk Filesystems, appeared in the Jan/Feb 1994 issue of Sys Admin.