Sun's Volume Manager
Peter Baer Galvin
It used to be Solstice DiskSuite and Solaris Disk Suite. Now it's
Solaris Volume Manager. Was the name changed to highlight new functionality,
or to protect the guilty? This month, the Solaris Companion takes it
for a spin to determine which is the case.
Overview
As with the past couple of Solaris Companion columns, this one
addresses a new Solaris 9 feature. This month, I'll examine
Solaris Volume Manager (SVM) and take a close look at its features,
problems, and a field trial. Does SVM give Veritas Volume Manager
some competition on large disk-space machines? Probably not. But
it does have sufficient utility to be used on small systems, as
did DiskSuite, and even on mid-sized machines with moderate disk
space. In the remainder of this column, I'll look at the features
and functions of SVM, walk through an implementation, and evaluate
the final results.
Features
SVM is now a full member of the Solaris product. It is installed
by default, not separately. Its integration is a welcome relief
to those who need volume management. In the old days (Solaris 8
and below), any disk management had to be removed from any system
disks before upgrades could be performed, and had to be re-implemented
after the upgrade. With SVM, the upgrade tools are supposed to be aware
of it and should not require any special treatment.
Other important changes to SVM were long in coming, but are still
welcome. It now has a concept of virtual disks, in which it dices
a physical disk into N virtual disks (which are variously called
"volumes" and "metadevices"). "N"
can be as high as 8192, but it is limited to 128 by default.
Previously, a disk was limited to the standard eight partitions. A volume
can be "sliced" to contain its own virtual partitions.
SVM also has the idea of diskgroups (called "disk sets").
This is a set of disks that travels together and cannot be split
between systems. This is useful for clustering, for example, where
all of the disks involved in an Oracle instance would need to be
mounted together on one machine.
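For readers who want to see what a disk set looks like from the command
line, here is a minimal sketch rather than a tested procedure. The set
name (oraset), the host names (hosta, hostb), and the disk names are all
hypothetical. The run wrapper only prints each command unless RUN=1 is
set, so the script is safe to read through on any machine.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set RUN=1 (as root, on a Solaris 9 host) to execute them for real.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

SET=oraset                  # hypothetical disk set name
HOSTS="hosta hostb"         # hypothetical hosts that may own the set
DISKS="c1t0d0 c1t1d0"       # hypothetical shared disks

make_diskset() {
    run metaset -s "$SET" -a -h $HOSTS   # register the hosts with the set
    run metaset -s "$SET" -a $DISKS      # add whole disks to the set
    run metaset -s "$SET"                # display the resulting set
}

make_diskset

# To move the set, release it on the current owner (metaset -s oraset -r)
# and take it on the other host (metaset -s oraset -t).
```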
Another nice change is that volume names are stored within the
volumes. A disk moved from one physical slot to another will no longer
be mistaken for the disk that used to occupy that slot, a mistake that
could result in data loss or worse (that is, job loss). Rather, the system will
read the volume information, determine what is on the disk, and
reinitiate RAID protection or whatever is appropriate.
SVM also retains some old features. A hot spare pool provides
slices to be used as replacements for failed RAID-protected slices.
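A hot spare pool might be set up along the following lines. This is a
sketch under assumptions: the pool name hsp001, the spare slice
c1t1d0s0 (on a hypothetical third disk), and the submirror names d1 and
d2 are illustrative only, and the spare must be at least as large as the
slice it may have to replace. Commands are printed, not run, unless
RUN=1 is set.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set RUN=1 (as root, on a Solaris 9 host) to execute them for real.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

POOL=hsp001          # hypothetical hot spare pool name
SPARE=c1t1d0s0       # hypothetical spare slice on a separate disk
SUBMIRRORS="d1 d2"   # submirrors that may draw on the pool

setup_hot_spares() {
    run metainit "$POOL" "$SPARE"          # create the pool with one spare
    for vol in $SUBMIRRORS; do
        run metaparam -h "$POOL" "$vol"    # associate the pool with a submirror
    done
    run metahs -i                          # report hot spare pool status
}

setup_hot_spares
```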
SVM supports the usual RAID suspects -- RAID 0, 1, and 5. It
is flexible enough to allow you to build RAID 0+1 (stripe and then
mirror) and RAID 1+0 (mirror and then stripe) volumes (although
I did not test these). It still uses state databases and replicas
to maintain the state of the entire configuration. It still provides
"transaction volumes" for logging UFS metadata changes,
but that has been superseded by the logging that is now built into
UFS.
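As a rough illustration of the RAID levels mentioned above, the sketch
below prints the metainit calls for a stripe, a RAID 5 volume, and a
mirror built from two stripes. None of this was part of the field trial;
every volume and slice name is hypothetical, and the commands are only
echoed unless RUN=1 is set.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set RUN=1 (as root, on a Solaris 9 host) to execute them for real.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

build_examples() {
    # RAID 0: one stripe, three slices wide, 32-KB interlace
    run metainit d20 1 3 c1t0d0s0 c1t1d0s0 c1t2d0s0 -i 32k
    # RAID 5 across three slices
    run metainit d30 -r c2t0d0s0 c2t1d0s0 c2t2d0s0
    # Mirror of two stripes: build each stripe, then mirror them
    run metainit d41 1 2 c1t3d0s0 c1t4d0s0
    run metainit d42 1 2 c2t3d0s0 c2t4d0s0
    run metainit d40 -m d41      # one-way mirror on the first stripe
    run metattach d40 d42        # attach the second stripe as a submirror
}

build_examples
```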
The SVM documentation is comprehensive and useful. It
provides not only "how to" information, but also guidelines
on the best ways to use RAID, best disk layouts, naming conventions
to use, best use of file systems, performance optimization, and
so on.
Installation
The test system is a SunBlade 100 with two internal IDE disks.
The system is generic, running unpatched Solaris 9. Before using
SVM, I was a good worker bee and read through the Solaris 9 Runtime
Issues release notes. It was disappointing to see six separate issues
having to do with SVM. All of the issues seem avoidable with planning,
so if you are planning on using Solaris 9 SVM, I recommend reading up on
these issues. Another disappointment is Sun's choice to call
this facility Solaris Volume Manager. There is already the Veritas
Volume Manager (VxVM), which Sun also OEMs as the
Sun StorEdge Volume Manager (SEVM). Great attention must now be
paid when looking at bug reports and patches to ensure that they
are addressing the correct volume manager. Many bug and patch reports
simply talk of "volume manager" and must be read in detail
to see exactly to what they are referring. That said, there did
not seem to be any SVM-specific patches available at the
time of this writing (August 2002). DiskSuite seemed like a fine
name to me...
As mentioned, SVM is installed by default with the full Solaris
package set. Management of SVM is either by the "Enhanced Storage
tool" within the Sun Management Console or via the command
line. The documentation warns not to try to use both methods at
once or bad things could happen. For this test, I used the GUI,
just to be different. It is started via the command /usr/sbin/smc.
Navigating through "This Computer" to "storage"
and then "enhanced storage" leads to the screen shown
in Figure 1.
As with DiskSuite, the first step with SVM on a new machine is
to create state database replicas. Choosing the replica tool in
SMC, then using the action->create replicas menu selection allowed
creation. Unfortunately, the root disk on the test system was fully
partitioned, with no unused disk space that could be used by SVM.
When installing the system, I performed a custom disk layout, so
it is unclear whether the new default disk layout would have left
space free for SVM to use. To remedy the situation, I carved out
a 50-MB partition from the swap space. First, I did a swap -d
of the root disk swap partition to un-allocate it, allowing it to
be repartitioned. Second, I used format to add a slice 5
to the system, made from the last 50 MB of slice 0 (swap). Labeling
the disk committed the change. (At that point my OpenWindows session
exited -- possibly because I needed space to page to but had
no swap space allocated. I had further stability problems with OpenWindows,
and decided to install the current Solaris 9 patch cluster.)
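The swap shuffle can be sketched as follows. The swap device name is an
assumption (substitute whatever swap -l reports on your system), and the
final swap -a step is a suggestion of this sketch, not something
described above; re-adding swap promptly may avoid running without any
paging space. Commands are printed unless RUN=1 is set.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set RUN=1 (as root, on a Solaris 9 host) to execute them for real.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

SWAP_DEV=/dev/dsk/c0t0d0s1   # assumption: use the device shown by swap -l

free_swap_slice() {
    run swap -l                 # list the active swap devices first
    run swap -d "$SWAP_DEV"     # stop paging to the slice
    # Repartition interactively with format(1M) at this point:
    # shrink the swap slice, create the new 50-MB slice, then label.
    run swap -a "$SWAP_DEV"     # resume paging on the (now smaller) slice
}

free_swap_slice
```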
While I was in a partitioning mood, I partitioned my second disk
the same way. It was unclear whether SMC and SVM would have chosen
to duplicate the root disk layout, so I forced the issue. Next,
I chose to create replicas on slice 5 of the two disks. I put two
replicas per slice, as a minimum of three replicas is required. Obviously,
three disks with one replica each would have been optimal. I enabled
"show commands" within SMC, and it obliged by showing
what it was about to do.
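A command-line equivalent of these two steps might look like the sketch
below. It is not a transcript of what SMC displayed. The prtvtoc and
fmthard pipeline is one common way to copy a partition table and assumes
both disks have the same geometry; it overwrites the label on the second
disk. Disk names follow the test system (c0t0d0 and c0t2d0), and nothing
is executed unless RUN=1 is set.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set RUN=1 (as root, on a Solaris 9 host) to execute them for real.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

ROOT_DISK=c0t0d0
MIRROR_DISK=c0t2d0
REPLICA_SLICE=s5

copy_partition_table() {
    # Assumes identical disk geometry; destroys the second disk's label.
    run sh -c "prtvtoc /dev/rdsk/${ROOT_DISK}s2 | fmthard -s - /dev/rdsk/${MIRROR_DISK}s2"
}

create_replicas() {
    # -a add, -f force creation of the first replicas, -c 2 two per slice
    run metadb -a -f -c 2 "${ROOT_DISK}${REPLICA_SLICE}" "${MIRROR_DISK}${REPLICA_SLICE}"
    run metadb -i     # list the replicas with a legend of status flags
}

copy_partition_table
create_replicas
```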
Determining how to mirror an existing file system partition was
more of a challenge. The SVM documentation provides both GUI and
command-line directions for mirroring a partition that cannot be
unmounted (e.g., the root disk partitions). Unfortunately, following
the instructions for the GUI method failed, because it reported
that the partition was mounted and did not give an option for using
a mounted partition. Thus, the GUI had to be abandoned and the command
line used:
metainit -f d1 1 1 c0t0d0s0 -- Create a concat RAID 0 volume from the existing slice.
metainit d2 1 1 c0t2d0s0 -- Find the equivalent unused slice on the mirror disk and make it a concat RAID 0 as well.
metainit d0 -m d1 -- Create a one-way mirror from the existing RAID 0.
metaroot d0 -- Change /etc/vfstab and /etc/system to use the new mirror, then reboot the system.
metattach d0 d2 -- Attach the second RAID 0 to the one-way mirror, which syncs it to the second disk.
metastat -- Check SVM status and verify that the commands executed properly.
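Because metastat output gets long, a small filter can help when checking
on the mirror after the metattach. The sketch below assumes the usual
metastat wording ("Needs maintenance", "Last Erred", "Resync in
progress"); check it against the output on your own system. The function
name report_trouble is made up for this example.

```shell
#!/bin/sh
# Print only the metastat lines that suggest a problem or a running resync.
# Reads metastat output on standard input.
report_trouble() {
    sed -n -e '/Needs maintenance/p' \
           -e '/Last [Ee]rred/p' \
           -e '/Resync in progress/p'
}

# Run against the live configuration only where metastat is installed.
if type metastat >/dev/null 2>&1; then
    metastat | report_trouble
fi
```

No output means none of the three phrases appeared, which is not proof
of health; a full metastat is still worth reading once.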
Note that for complete functionality, you must change the boot-device
setting in the EEPROM so that the slice underlying d2 (c0t2d0s0) is the
alternate boot device. In this way, if the primary root slice failed,
the secondary would be used automatically after a reboot.
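Setting up the alternate boot device starts with finding the physical
device path behind the mirror slice. The helper below is a sketch: the
function name obp_path and the alias backup_root are made up, and the
sample IDE path is illustrative, not copied from the test system. It
trims everything up to /devices and swaps the driver name (sd or dad)
for disk, which is the form the OpenBoot PROM expects.

```shell
#!/bin/sh
# Convert the symlink target of a /dev/dsk entry into an OpenBoot path.
# Reads one or more "ls -l" lines (or bare link targets) on standard input.
obp_path() {
    sed -e 's|^.*/devices/|/|' -e 's|/sd@|/disk@|' -e 's|/dad@|/disk@|'
}

# Example with an illustrative IDE path:
echo '../../devices/pci@1f,0/ide@d/dad@2,0:a' | obp_path
# prints: /pci@1f,0/ide@d/disk@2,0:a

# On the real system:  ls -l /dev/dsk/c0t2d0s0 | obp_path
# Then, at the ok prompt, using the path printed above:
#   ok nvalias backup_root /pci@1f,0/ide@d/disk@2,0:a
#   ok setenv boot-device disk backup_root
```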
As an experiment, I next performed the concat RAID 0 creation
on swap and /export/home via the command line (the metainit -f
commands that were not executable via the GUI). I then used the
GUI to create the mirror pairs. Using the wizard was a bit tedious,
but selecting the correct options was straightforward. It helped
to have a list of the metadevice names and their related partitions,
for example, to ensure that a blank slice was not mirrored onto its
filesystem-containing counterpart. On the surface,
this method worked correctly. However, the GUI did not update /etc/vfstab
to tell the system to use the new mirror devices rather than the
original partitions. Making this change and rebooting the system
resulted in a fully mirrored root disk implementation. The final
GUI view is shown in Figure 2. The final /etc/vfstab is shown
in Figure 3.
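The /etc/vfstab change itself is mechanical: each /dev/dsk or /dev/rdsk
path is replaced by the matching /dev/md path. The sketch below shows
the substitution on two sample lines. The slice numbers and the
metadevice names d10 and d30 are assumptions made for this example, and
the function name vfstab_use_md is made up; work on a copy of the file,
never on /etc/vfstab directly.

```shell
#!/bin/sh
# Rewrite vfstab entries for one slice to use a metadevice instead.
# usage: vfstab_use_md <slice> <metadevice> < vfstab.copy > vfstab.new
vfstab_use_md() {
    sed -e "s|/dev/dsk/$1|/dev/md/dsk/$2|" \
        -e "s|/dev/rdsk/$1|/dev/md/rdsk/$2|"
}

# Example on two sample lines (hypothetical slices and metadevices):
printf '%s\n' \
    '/dev/dsk/c0t0d0s1 - - swap - no -' \
    '/dev/dsk/c0t0d0s7 /dev/rdsk/c0t0d0s7 /export/home ufs 2 yes -' |
    vfstab_use_md c0t0d0s1 d10 |
    vfstab_use_md c0t0d0s7 d30
# prints:
# /dev/md/dsk/d10 - - swap - no -
# /dev/md/dsk/d30 /dev/md/rdsk/d30 /export/home ufs 2 yes -
```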
Conclusions
SVM appears to be a nice leap forward from the old DiskSuite.
Inclusion by default on a system makes the facility more seamless.
Awareness of SVM by Solaris installation tools should allow much
simpler systems administration of SVM-managed disks. SVM is easy
to get started with (especially with the available GUI), but root
disk mirroring is still a manual and detail-oriented task. Performing
non-root-disk tasks should be easier. In my simple testing, it worked
and performed well.
Extensive testing within your environment would be a great step
toward full use of SVM on your systems. The soft partition feature
adds a lot of flexibility and is worth exploring. For example, you
could take a RAID 1 set, and create N soft partitions on it, one
for each user on the system. Should a user run out of space, you
could expand the soft partition to make more available.
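The per-user idea could be sketched like this. It is untested here and
rests on assumptions: d100 stands for an existing RAID 1 volume, d101
for one user's soft partition, and the sizes and mount point are
invented. Commands are printed unless RUN=1 is set.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set RUN=1 (as root, on a Solaris 9 host) to execute them for real.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

BASE=d100    # hypothetical existing RAID 1 volume

# usage: make_user_partition <volume> <size>
make_user_partition() {
    run metainit "$1" -p "$BASE" "$2"    # carve a soft partition from BASE
    run newfs "/dev/md/rdsk/$1"          # put a UFS file system on it
}

# usage: grow_user_partition <volume> <extra-size> <mount-point>
grow_user_partition() {
    run metattach "$1" "$2"                  # add space to the soft partition
    run growfs -M "$3" "/dev/md/rdsk/$1"     # grow the mounted UFS to match
}

make_user_partition d101 2g
grow_user_partition d101 1g /export/home/alice
```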
Overall, SVM moves Sun's disk management tools from the realm
of small-number-of-disk systems to those with a moderate number
of disks. I would still hesitate to use it for terabytes of SAN-attached
storage, but it could handle gigabytes of locally attached storage
and save money by replacing Veritas Volume Manager in those instances.
Good Book Alert
Well, O'Reilly has done it again with the new edition of
The Networking CD Bookshelf. Version 2.0 contains a CD that
includes the full contents of seven O'Reilly & Associates
books: TCP/IP Network Administration 3rd ed; Managing
NFS and NIS 2nd ed; Network Troubleshooting Tools; SSH,
the Secure Shell: The Definitive Guide; Essential SNMP;
DNS and BIND 4th ed; and Building Internet Firewalls
2nd ed. I find the books on Firewalls and NFS especially good, with
useful reference material in the others. The "Book" also
includes one printed book, TCP/IP Network Administration.
My guess is that the printed book is the best seller of this lot,
but that is just a guess. The books are all in HTML format with
embedded images. All are accessible via a browser home page showing
all the books and providing a search feature that will search all
the books via one button. Books can be searched separately as well.
I found no browser-related issues, as the contents worked fine in
Microsoft Internet Explorer, as well as Mozilla 1.0.
The cost is on the high side for a book ($119.95 US list), but
considering all the content and the utility of having the contents
electronically and searchable, it's a bargain. The only potential
downside is the license agreement that essentially allows access
to the contents by only one user. The contents cannot be centrally
located and remotely accessed by multiple people, for example, without
buying more licenses. For personal use, however, I loaded a copy
of the contents on my home Linux server, which I can access remotely,
on my home network, or via my wireless network. I also put a copy
on my laptop, and even onto the 1-GB CF drive I use on my PDA. The
content is only 50 MB, so this is quite reasonable. It's nice
to know that I have all that great information available no matter
where I am. Overall, The Networking CD Bookshelf 2.0 is a
great set of books, reasonably priced and provided in the most usable
manner possible. Highly recommended.
Peter Baer Galvin (http://www.petergalvin.org) is the
Chief Technologist for Corporate Technologies (www.cptech.com),
a premier systems integrator and VAR. Before that, Peter was the
systems manager for Brown University's Computer Science Department.
He has written articles for Byte and other magazines, and
previously wrote Pete's Wicked World, the security column,
and Pete's Super Systems, the systems management column for
Unix Insider (http://www.unixinsider.com). Peter is
coauthor of the Operating Systems Concepts and Applied
Operating Systems Concepts textbooks. As a consultant and trainer,
Peter has taught tutorials and given talks on security and systems
administration worldwide.