The Solaris Companion: Reliable and Practical Root Disk Mirroring
Peter Baer Galvin
Content Level: Advanced
Content Audience: Solaris Administrators and Managers
Abstract
There are several approaches to mirroring the root disk on Solaris
machines. In this article, all of the common ones are explored, and a
"best practice" is suggested that meets the requirements of being
functional and reliable.
The Problem
Disks are the component most likely to fail on a Solaris machine. Both
experience and theory show this to be the case. Some disk failures
cause little or no distress to the system or its users (e.g.,
an unused disk, or one used for temporary work). Disks with crucial
data, if lost, cause serious angst amongst everyone involved. A
replacement disk and tape restore can resolve this problem, to a great
extent.
Then there is the root disk -- if you lose it, the system is
unavailable. Getting the system up and running can involve contortions
of CD-ROM boots, backup software implementation, alternate-disk
restoration, alternate-disk boot, and finally, restore of the original
root-disk contents from backup tape. You do have backups of the root
disk, right?
Let's take a look at the root-disk mirroring options. As I will show,
there is no one, perfect, out-of-the-box solution. Rather, there is a
"best practice" solution, developed at Corporate Technologies by Manny
Korkodilos and Kyle Oliver, which gives the best of all worlds, while also
avoiding the problems that haunt the other solutions.
Before You Proceed
Please do not try any of these solutions on your most crucial production
server. The best place to try them is in a "sandbox", on a system
that is disposable. It is also best to implement disk mirroring at
system installation time, not after it is in production. Implementing
mirroring on a production system, without losing data, is a
challenge. Is that enough of a warning!?
Requirements
The best disk-mirroring solution should include these features:
- It must automatically recover from a single disk failure.
- It must allow easy removal for system upgrades.
- It must not adversely affect performance.
- It must allow other disk management solutions to be installed
and used.
As I will show, none of the obvious solutions satisfactorily solves these
problems. Of course, your requirements may vary in any given situation. In
those cases, all mirroring options should be considered and the best
fit should be chosen.
Methods
Periodic Disk Copy
A fairly standard solution to the problem of losing root-disk data is
to script a periodic copy of the data to another disk. Several scripts
for this purpose have been published (including one by me, which
may get updated here in the future). In general, they use
dd or ufsdump/ufsrestore to copy each
partition of the root disk to a backup disk. installboot
is needed to make the backup disk bootable, and files such as
/etc/vfstab must be changed to match the backup disk's
parameters.
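As a rough illustration of the copy described above, here is a minimal Python sketch. It only builds the command strings for a ufsdump/ufsrestore copy and rewrites vfstab text for the backup disk; it runs nothing. The function names, the disk names c0t0d0 and c0t1d0, the slice list, and the /mnt mount point are all assumptions made for the example. The installboot line is the usual SPARC form and should be checked against the man page for your release.

```python
import re

# All names below (functions, disks, slices, mount point) are illustrative
# assumptions, not taken from any published script.

def copy_plan(slices, src_disk, dst_disk, mount_point="/mnt"):
    """Return shell command strings that copy each slice; nothing is executed."""
    cmds = []
    for s in slices:
        src_raw = "/dev/rdsk/%ss%d" % (src_disk, s)
        dst_raw = "/dev/rdsk/%ss%d" % (dst_disk, s)
        dst_blk = "/dev/dsk/%ss%d" % (dst_disk, s)
        cmds.append("newfs %s" % dst_raw)  # destroys whatever is on the backup slice
        cmds.append("mount %s %s" % (dst_blk, mount_point))
        cmds.append("ufsdump 0f - %s | (cd %s && ufsrestore rf -)" % (src_raw, mount_point))
        cmds.append("umount %s" % mount_point)
    return cmds


def boot_block_command(dst_disk, root_slice=0):
    """Return the installboot command (SPARC form) for the backup root slice."""
    return ("installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk "
            "/dev/rdsk/%ss%d" % (dst_disk, root_slice))


def retarget_vfstab(vfstab_text, src_disk, dst_disk):
    """Point /dev/dsk and /dev/rdsk entries for src_disk at dst_disk."""
    pattern = re.compile(r"(/dev/r?dsk/)" + re.escape(src_disk) + r"(s\d+)")
    out = []
    for line in vfstab_text.splitlines():
        if line.lstrip().startswith("#"):
            out.append(line)  # leave comment lines untouched
        else:
            out.append(pattern.sub(lambda m: m.group(1) + dst_disk + m.group(2), line))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    for cmd in copy_plan([0, 6], "c0t0d0", "c0t1d0"):
        print(cmd)
    print(boot_block_command("c0t1d0"))
```

The rewritten vfstab text would be saved as etc/vfstab under the mount point while the backup root slice is mounted, so that the backup disk mounts its own slices when it is booted.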
Unfortunately, this solution does not fit in most circumstances. If
the root disk fails, the system will crash. Then, someone must issue
commands to boot the system from the alternate disk. (This someone is
typically you, at 3AM.) The problem must then be resolved and
the contents copied back to the replacement root disk, with another
reboot to reset the system to its original state.
This solution is suitable in some circumstances, however, such as a
less important machine with little uptime requirement. It can also be
used to augment any of the other methods described here. For instance,
DiskSuite can mirror the root disk automatically, and a disk copy can
ensure that even if DiskSuite fails, or (more likely) someone executes
an unfortunate command that damages the system, the system can be
rapidly recovered.
Solstice DiskSuite
DiskSuite is free and included with Solaris. It can be used
for more than mirroring, and for more than the root disk. It has a GUI
as well as a command-line interface. Even experienced DiskSuite users
steer away from using it on complex disk configurations or on more
than a few disks. It tends to scale poorly in terms of its
manageability. However, at first blush, it appears to be the perfect
root-mirroring tool.
The first time you use DiskSuite, it is complex to implement. With
experience, however, it becomes a straightforward and useful tool. It
also meets the first three of our requirements.
If either disk of a root-disk mirror pair fails, the other will be
used and the system will continue running. For operating-system
upgrades, DiskSuite is unconfigured and removed (disabling disk
mirroring), the upgrade is performed, and DiskSuite is then
reimplemented. (Upgrading to Solaris 8 will be covered in a future
column.) With RAID level 1, performance loss on the root disk is
minimized.
Unfortunately it fails the last test, but this takes some
explaining...
Veritas Volume Manager
Servers with multiple disks or multiple arrays are typically managed
by Veritas Volume Manager (VXVM). VXVM provides both GUI and command-line interfaces, and is designed to manage hundreds or even thousands
of disks. It provides functionality for sharing disks between servers,
and for control of disks within a cluster and during cluster
failover. It is by far the most common disk-management system on
large Sun servers.
VXVM requires that one or more disks be included in "the rootdg", a
"disk group" that holds configuration information for all other disk
groups. This disk group cannot be on a shared array (i.e., one attached
to two servers in a cluster) because the disk group can never be
"exported" from one machine and "imported" on another. Thus no disk in
rootdg can be made available to any other machine, ever. Therefore, experienced
administrators put the boot disk, or other local disks, in
rootdg. They put all other disks in other disk groups.
Consider that if a database server fails, you might want to move the
database disks to another functional host until the problem is
resolved. If those disks are in rootdg, they cannot be imported to the
new system. However, if they are in another disk group, this operation
is easily implemented.
So what's the problem? We can put the boot disk, plus another
internal disk, in rootdg, and use VXVM to mirror between the two.
Unfortunately, there's trouble in paradise. First, the root disk is
built before VXVM is installed. For the root disk to be managed by
VXVM, it must be "encapsulated". This process makes room for VXVM to
add its management information to the disk, without having to rebuild
the disk. Unfortunately, if you ever want to upgrade Solaris on this
system, you must unencapsulate the disk first (because the upgrade
procedures do not recognize the boot disk as a Solaris disk).
To understand another problem, consider the case of the root disk failing while its secondary mirror keeps working properly. Naturally, the root disk is replaced and VXVM re-mirrors from the secondary. In this case the root disk is no longer encapsulated. Rather, it is a VXVM-managed disk. Unfortunately, Solaris upgrades do not know how to deal with VXVM-managed disks, so you can no longer easily upgrade that system. Even worse, you cannot even boot from CD-ROM and manually mount the root disk partitions (to recover from a lost root password or a dozen other problems)!
There even used to be a Sun Blueprint that recommended the following
steps:
- Encapsulate the root disk
- Mirror to secondary disk
- Tell VXVM that the root disk went bad
- Tell VXVM to remirror to the root disk
The net result was that your system had a VXVM-managed boot disk, and it could no longer be upgraded, nor could the disk be mounted after a CD-ROM boot! I can no longer find that Blueprint, so hopefully it was rescinded. There is another root disk Blueprint that appears quite good at first blush: the VXVM Reference Blueprint (http://www.sun.com/blueprints/0800/vxvmref.pdf).
But most problematic is that there have been many instances of a VXVM
mirrored root disk failing, and the mirrored copy not automatically
taking its place. The system either crashes or hangs until the
problem is resolved. This fact is not widely known because the
number of systems with VXVM-mirrored root disks that have had one of
those disks fail is small. This is a trend that we have seen, however. VXVM
root-disk mirroring has been known to work, especially in a clean
failure condition, such as one of the disks being removed for mirroring
testing, but it is just as likely to fail. In fact,
please send email to pbg@petergalvin.org if you have
experiences to share in this area.
Thus, an internal disk must be in rootdg, but VXVM should not be used to
mirror the root disk on Solaris systems. On two-disk systems, such as
Netra T1s, Sun Blade 100s and 1000s, 220Rs, and 420Rs, we have a
problem. We would rather use DiskSuite for mirroring, but need to have
one of the disks in the rootdg to keep VXVM happy.
The Best of All Worlds
The solution is to combine these two products. Through quite a bit of
work, you can use DiskSuite to mirror the root disks, but carve out a
small partition and make that the rootdg. The effort is worthwhile,
as this solution meets all four of the criteria:
- It (almost always) automatically recovers from the failure of one
disk of the mirror pair. Note, however, that the disk failure must
be detected and corrected in order to avoid having the other disk fail,
taking down the system! Seriously, this has occurred at sites that did
not pay sufficient attention.
- To upgrade a system in this configuration, DiskSuite must be
removed, but VXVM can stay installed, unaffected by the upgrade.
- Performance is the same as DiskSuite mirroring alone -- that
is, quite good.
- DiskSuite can be used for what it is good at -- management of a
few disks -- and VXVM can do its job of managing the rest.
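The first point above depends on someone noticing that half of the mirror has failed. A minimal Python sketch of such a check follows. It assumes the text comes from DiskSuite's metastat command, and that its output has metadevice headers such as "d10: Mirror" followed by indented "State:" lines. That layout is recalled from typical output and should be verified on your system. The function name and the sample text are hypothetical.

```python
import re

# Hypothetical helper; the sample text is an abbreviated, assumed
# approximation of metastat output, not a captured log.

def mirrors_needing_attention(metastat_text):
    """Return metadevice names that have any State line other than 'Okay'."""
    bad = []
    current = None
    for line in metastat_text.splitlines():
        header = re.match(r"^(d\d+): ", line)
        if header:
            current = header.group(1)  # start of a new metadevice section
            continue
        state = re.match(r"^\s+State:\s*(.+?)\s*$", line)
        if state and current and state.group(1) != "Okay":
            if current not in bad:
                bad.append(current)
    return bad


if __name__ == "__main__":
    sample = (
        "d10: Mirror\n"
        "    Submirror 0: d11\n"
        "      State: Okay\n"
        "    Submirror 1: d12\n"
        "      State: Needs maintenance\n"
        "d11: Submirror of d10\n"
        "    State: Okay\n"
        "d12: Submirror of d10\n"
        "    State: Needs maintenance\n"
    )
    print(mirrors_needing_attention(sample))
```

Something like this could be run from cron against real metastat output, with any non-empty result mailed to the administrator.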
Next Month
Next month's column will include the step-by-step instructions for
building a combination DiskSuite and VXVM system.
I hope this first installment of the Solaris Corner column is useful
for you. Over the coming months, I expect to have quite a lot of
useful information from a variety of experts in the field of Solaris
administration.
Also, if you feel that you have information to share with
your fellow Solaris administrators, please let me know. We would like to publish articles, book reviews, helpful tips and tricks, conference reports, and of course, useful resource pointers. Fame and fortune (well, at least a little of each) are yours for the asking.
Peter Baer Galvin (http://www.petergalvin.org) is the Chief Technologist for Corporate Technologies, a premier systems integrator and VAR. Before that, Peter was the systems manager for Brown University's Computer Science Department. He has written articles for Byte and other magazines, and previously wrote Pete's Wicked World, the security column, and Pete's Super Systems, the systems management column for Unix Insider (http://www.unixinsider.com). Peter is coauthor of the Operating System Concepts and Applied Operating System Concepts textbooks. As a consultant and trainer, Peter has taught tutorials and given talks on security and system administration worldwide.