Cover V10, I05

The Solaris Companion: Reliable and Practical Root Disk Mirroring

Peter Baer Galvin

Content Level: Advanced
Content Audience: Solaris Administrators and Managers

Abstract

There are several approaches to mirroring the root disk on Solaris machines. In this article, all of the common ones are explored, and a "best practice" is suggested that meets the requirements of being functional and reliable.

The Problem

Disks are the component most likely to fail on a Solaris machine. Both experience and theory show this to be the case. Some disk failures cause little or no distress to the system or its users (e.g., an unused disk, or one used for temporary work). Disks with crucial data, if lost, cause serious angst amongst everyone involved. A replacement disk and tape restore can resolve this problem, to a great extent.

Then there is the root disk -- if you lose it, the system is unavailable. Getting the system up and running can involve contortions of CD-ROM boots, backup software implementation, alternate-disk restoration, alternate-disk boot, and finally, restore of the original root-disk contents from backup tape. You do have backups of the root disk, right?

Let's take a look at the root-disk mirroring options. As I will show, there is no one, perfect, out-of-the-box solution. Rather, there is a "best practice" solution, developed at Corporate Technologies by Manny Korkodilos and Kyle Oliver, which gives the best of all worlds, while also avoiding the problems that haunt the other solutions.

Before You Proceed

Please do not try any of these solutions on your most crucial production server. The best place to try them is in a "sandbox", on a system that is disposable. It is also best to implement disk mirroring at system installation time, not after it is in production. Implementing mirroring on a production system, without losing data, is a challenge. Is that enough of a warning!?

Requirements

The best disk-mirroring solution should include these features:

  • It must automatically recover from a single disk failure.
  • It must allow easy removal for system upgrades.
  • It must not adversely affect performance.
  • It must allow other disk management solutions to be installed and used.

As I will show, none of the obvious solutions satisfactorily solves these problems. Of course, your requirements may vary in any given situation. In those cases, all mirroring options should be considered and the best fit should be chosen.

Methods

Periodic Disk Copy

A fairly standard solution to the problem of losing root-disk data is to script a periodic copy of the data to another disk. Several scripts for this purpose have been published (including one by me, which may get updated here in the future). In general, they use dd or ufsdump/ufsrestore to copy each partition of the root disk to a backup disk. installboot is needed to make the backup disk bootable, and files such as /etc/vfstab must be changed to match the backup disk's parameters.
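
As a rough illustration of what such a script does, here is a minimal dry-run sketch in Python. It only builds and prints the commands for one copy pass; it runs nothing. The disk names (c0t0d0, c0t1d0), the slice map, and the /mnt mount point are assumptions chosen for the example, the function name copy_plan is hypothetical, and the bootblk path shown is the usual one on SPARC systems. It is not the published script mentioned above.

```python
# Dry-run sketch: build the commands for one periodic root-disk copy pass.
# Nothing is executed. Disk names, slices, and mount point are assumptions.


def copy_plan(source, backup, slices, mount_point="/mnt"):
    """Return shell commands that copy each UFS slice of `source` to `backup`.

    `slices` maps a slice number to its mount point on the running system,
    for example {0: "/", 6: "/usr"}.
    """
    plan = []
    for number in sorted(slices):
        src_raw = f"/dev/rdsk/{source}s{number}"
        dst_raw = f"/dev/rdsk/{backup}s{number}"
        dst_blk = f"/dev/dsk/{backup}s{number}"
        # Build a fresh file system on the backup slice and mount it.
        plan.append(f"echo y | newfs {dst_raw}")
        plan.append(f"mount {dst_blk} {mount_point}")
        # Copy the slice contents with ufsdump piped into ufsrestore.
        plan.append(
            f"ufsdump 0f - {src_raw} | (cd {mount_point} && ufsrestore rf -)"
        )
        plan.append(f"rm {mount_point}/restoresymtable")
        if slices[number] == "/":
            # Point the copied vfstab at the backup disk's devices.
            vfstab = f"{mount_point}/etc/vfstab"
            plan.append(
                f"sed 's/{source}/{backup}/g' {vfstab} > {vfstab}.new"
                f" && mv {vfstab}.new {vfstab}"
            )
            # Make the backup root slice bootable (SPARC bootblk path).
            plan.append(
                "installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk "
                + dst_raw
            )
        plan.append(f"umount {mount_point}")
    return plan


if __name__ == "__main__":
    for command in copy_plan("c0t0d0", "c0t1d0", {0: "/"}):
        print(command)
```

Printing the plan first, and reading it before running anything, fits the sandbox advice given earlier. A real script would also need error checking after each step.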

Unfortunately, this solution does not fit in most circumstances. If the root disk fails, the system will crash. Then, someone must issue commands to boot the system from the alternate disk. (This someone is typically you, at 3AM.) The problem must then be resolved and the contents copied back to the replacement root disk, with another reboot to reset the system to its original state.

This solution is suitable in some circumstances, however, such as a less important machine with little uptime requirement. It can also be used to augment any of the other methods described here. For instance, DiskSuite can mirror the root disk automatically, and a disk copy can ensure that even if DiskSuite fails, or (more likely) someone executes an unfortunate command that damages the system, the system can be rapidly recovered.

Solstice DiskSuite

DiskSuite is free and included with Solaris. It can be used for more than mirroring, and for more than the root disk. It has a GUI as well as a command-line interface. Even experienced DiskSuite users steer away from using it on complex disk configurations or on more than a few disks. It tends to scale poorly in terms of its manageability. However, at first blush, it appears to be the perfect root-mirroring tool.

The first time you use DiskSuite, it is complex to implement. With experience, however, it becomes a straightforward and useful tool. It also meets the first three of our requirements.

If either disk of a root-disk mirror pair fails, the other will be used and the system will continue running. For operating-system upgrades, DiskSuite is unconfigured and removed (disabling disk mirroring), the upgrade is performed, and DiskSuite is reimplemented. (Upgrading to Solaris 8 will be covered in a future column.) With RAID level 1, performance loss on the root disk is minimized.
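
The configure/unconfigure cycle described above can be sketched as two command lists. This is a minimal dry-run sketch in Python that only returns the commands; it is not the step-by-step procedure promised for a later column. The metadevice names (d10, d11, d12), the replica slice (s7), the disk names, and both function names are assumptions for illustration, and only the root slice is shown; a real system would repeat the metainit/metattach steps for swap and any other slices.

```python
# Dry-run sketch of mirroring and un-mirroring the root slice with DiskSuite.
# Returns command strings only. All device and metadevice names are assumed.


def mirror_root_plan(primary, secondary, root_slice=0, replica_slice=7,
                     mirror="d10", sub1="d11", sub2="d12"):
    """Commands to put the root slice under a two-way DiskSuite mirror."""
    return [
        # Give the second disk the same partition table as the first.
        f"prtvtoc /dev/rdsk/{primary}s2 | fmthard -s - /dev/rdsk/{secondary}s2",
        # Create state database replicas on both disks.
        f"metadb -a -f -c 2 {primary}s{replica_slice} {secondary}s{replica_slice}",
        # One-slice submirrors; -f is needed because root is mounted.
        f"metainit -f {sub1} 1 1 {primary}s{root_slice}",
        f"metainit {sub2} 1 1 {secondary}s{root_slice}",
        # One-way mirror first, then tell the system to boot from it.
        f"metainit {mirror} -m {sub1}",
        f"metaroot {mirror}",
        "lockfs -fa",
        "init 6",
        # After the reboot, attach the second submirror to start the resync.
        f"metattach {mirror} {sub2}",
    ]


def unmirror_root_plan(primary, secondary, root_slice=0, replica_slice=7,
                       mirror="d10", sub2="d12"):
    """Commands to return root to a plain slice before an upgrade."""
    return [
        f"metadetach {mirror} {sub2}",
        f"metaroot /dev/dsk/{primary}s{root_slice}",
        "init 6",
        # After the reboot, clear the metadevices and delete the replicas.
        f"metaclear -r {mirror}",
        f"metaclear {sub2}",
        f"metadb -d -f {primary}s{replica_slice} {secondary}s{replica_slice}",
    ]


if __name__ == "__main__":
    for command in mirror_root_plan("c0t0d0", "c0t1d0"):
        print(command)
```

The two "init 6" entries mark reboots; the commands that follow each one would be run after the system comes back up.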

Unfortunately, it fails the last test, but this takes some explaining...

Veritas Volume Manager

Servers with multiple disks or multiple arrays are typically managed by Veritas Volume Manager (VXVM). VXVM provides both GUI and command-line interfaces, and is designed to manage hundreds or even thousands of disks. It provides functionality for sharing disks between servers, and for control of disks within a cluster and during cluster failover. It is by far the most common disk-management system on large Sun servers.

VXVM requires that one or more disks be included in "the rootdg", a "disk group" that holds configuration information for all other disk groups. This disk group cannot be on a shared array (i.e., one attached to two servers in a cluster) because it can never be "exported" from one machine and "imported" on another. Thus no disk in rootdg can be made available to any other machine, ever. Therefore, experienced administrators put the boot disk, or other local disks, in rootdg. They put all other disks in other disk groups.

Consider that if a database server fails, you might want to move the database disks to another functional host until the problem is resolved. If those disks are in rootdg, they cannot be imported to the new system. However, if they are in another disk group, this operation is easily implemented.
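
To make the move concrete, here is a minimal dry-run sketch in Python of the commands involved. The disk group name datadg and the function name are hypothetical, and the sketch assumes that file systems on the group's volumes have already been unmounted on the old host. It refuses rootdg outright, since that group cannot be moved between hosts.

```python
# Dry-run sketch: commands for moving a VXVM disk group to another host.
# Returns (where, command) pairs only; nothing is executed.


def move_disk_group_plan(group, old_host_alive=True):
    """Return (host, command) steps to move `group` to a new host."""
    if group == "rootdg":
        # rootdg holds the local configuration and cannot change hosts.
        raise ValueError("rootdg cannot be moved to another host")
    plan = []
    if old_host_alive:
        # Clean handoff: stop the volumes, then deport the group.
        plan.append(("old host", f"vxvol -g {group} stopall"))
        plan.append(("old host", f"vxdg deport {group}"))
        import_command = f"vxdg import {group}"
    else:
        # The old host died without deporting; -C clears its import locks.
        import_command = f"vxdg -C import {group}"
    plan.append(("new host", import_command))
    plan.append(("new host", f"vxvol -g {group} startall"))
    return plan


if __name__ == "__main__":
    for host, command in move_disk_group_plan("datadg"):
        print(f"{host}: {command}")
```

After the volumes are started on the new host, the file systems would still need to be checked and mounted there.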

So what's the problem? We can put the boot disk, plus another internal disk, in rootdg, and use VXVM to mirror between the two. Unfortunately, there's trouble in paradise. First, the root disk is built before VXVM is installed. For the root disk to be managed by VXVM, it must be "encapsulated". This process makes room for VXVM to add its management information to the disk, without having to rebuild the disk. Unfortunately, if you ever want to upgrade Solaris on this system, you must unencapsulate the disk first (because the upgrade procedures do not recognize the boot disk as a Solaris disk).

To understand another problem, consider the case of the root disk failing and its secondary mirror working properly. Naturally, the root disk is replaced and VXVM will re-mirror from the secondary. In this case the root disk is no longer encapsulated. Rather, it is a VXVM-managed disk. Unfortunately, Solaris upgrades do not know how to deal with VXVM-managed disks, so you can no longer easily upgrade that system. Even worse, you cannot even boot from CD-ROM and manually mount the root disk partitions (to recover from a lost root password or a dozen other problems)!

There even used to be a Sun Blueprint that recommended the following steps:

  • Encapsulate the root disk
  • Mirror to secondary disk
  • Tell VXVM that the root disk went bad
  • Tell VXVM to remirror to the root disk

The net result was that your system had a VXVM-managed boot disk, and it could no longer be upgraded, nor could the disk be mounted after a CD-ROM boot! I can no longer find that Blueprint, so hopefully it was rescinded. There is another root disk Blueprint that appears quite good at first blush: the VXVM Reference Blueprint (http://www.sun.com/blueprints/0800/vxvmref.pdf).

But most problematic is that there have been many instances of a VXVM-mirrored root disk failing, and the mirrored copy not automatically taking its place. The system either crashes or hangs until the problem is resolved. This fact is not widely known because the number of systems with VXVM-mirrored root disks that have had one of those disks fail is small. This is a trend that we have seen, however. VXVM root-disk mirroring has been known to work, especially in a clean failure condition, such as one of the disks being removed to test the mirror, but it is just as likely to fail. In fact, please send email (mailto:pbg@petergalvin.org) if you have experiences to share in this area.

Thus, an internal disk must be in rootdg, but VXVM should not be used to mirror the root disk on Solaris systems. On two-disk systems, such as Netra T1s, Sun Blade 100s and 1000s, 220Rs, and 420Rs, we have a problem. We would rather use DiskSuite for mirroring, but need to have one of the disks in rootdg to keep VXVM happy.

The Best of All Worlds

The solution is to combine these two products. Through quite a bit of work, you can use DiskSuite to mirror the root disks, but carve out a small partition and make that the rootdg. The effort is worthwhile, as this solution meets all four of the criteria:

  • It (almost always) automatically recovers from the failure of one disk of the mirrored pair. Note, however, that the disk failure must be detected and corrected before the other disk also fails and takes down the system! Seriously, this has occurred at sites that do not pay sufficient attention.
  • To upgrade a system in this configuration, DiskSuite must be removed, but VXVM can stay installed, unaffected by the upgrade.
  • Performance is the same as DiskSuite mirroring alone -- that is, quite good.
  • DiskSuite can be used for what it is good at -- management of a few disks -- and VXVM can do its job of managing the rest.
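
Because the first point depends on someone noticing the failed disk, a periodic check is worth having. The following is a minimal sketch in Python that scans the text printed by DiskSuite's metastat command for submirrors whose state is anything other than "Okay". The sample text is illustrative and abbreviated, not captured from a real system, and the function name is hypothetical; how the result is reported (email, pager, syslog) is left open.

```python
# Sketch: find submirrors that need attention in metastat-style output.
# SAMPLE_METASTAT is illustrative only, not output from a real system.
import re

SAMPLE_METASTAT = """\
d10: Mirror
    Submirror 0: d11
      State: Okay
    Submirror 1: d12
      State: Needs maintenance
"""

SUBMIRROR_LINE = re.compile(r"^\s*Submirror \d+:\s*(\S+)\s*$")
STATE_LINE = re.compile(r"^\s*State:\s*(.+?)\s*$")


def submirrors_needing_attention(metastat_text):
    """Return (submirror, state) pairs for every state other than Okay."""
    problems = []
    current = None  # submirror named by the most recent "Submirror N:" line
    for line in metastat_text.splitlines():
        submirror = SUBMIRROR_LINE.match(line)
        if submirror:
            current = submirror.group(1)
            continue
        state = STATE_LINE.match(line)
        if state and current is not None:
            if state.group(1) != "Okay":
                problems.append((current, state.group(1)))
            current = None  # each submirror line owns one state line
    return problems


if __name__ == "__main__":
    for name, state in submirrors_needing_attention(SAMPLE_METASTAT):
        print(f"{name}: {state}")
```

In practice the text would come from running metastat on a schedule, with any non-empty result sent to whoever is on call.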

Next Month

Next month's column will include the step-by-step instructions for building a combination DiskSuite and VXVM system.

I hope this first installment of the Solaris Companion column is useful for you. Over the coming months, I expect to have quite a lot of useful information from a variety of experts in the field of Solaris administration.

Also, if you feel that you have information to share with your fellow Solaris administrators, please let me know. We would like to publish articles, book reviews, helpful tips and tricks, conference reports, and of course, useful resource pointers. Fame and fortune (well, at least a little of each) are yours for the asking.

Peter Baer Galvin (http://www.petergalvin.org) is the Chief Technologist for Corporate Technologies, a premier systems integrator and VAR. Before that, Peter was the systems manager for Brown University's Computer Science Department. He has written articles for Byte and other magazines, and previously wrote Pete's Wicked World, the security column, and Pete's Super Systems, the systems management column for Unix Insider (http://www.unixinsider.com). Peter is coauthor of the Operating System Concepts and Applied Operating System Concepts textbooks. As a consultant and trainer, Peter has taught tutorials and given talks on security and system administration worldwide.