Cover V10, I04
Article
Figure 1
Figure 2

apr2001.tar


Quick and Dirty Server/Workstation Replication Using ufsdump

Ben Diamond and Keil Wurl

Five minutes of Internet searching can reveal a myriad of articles and scripts on backup techniques and backup strategies. Sys Admin's archive alone hosts a multitude of great ideas like W. Curtis Preston's hostdump.sh script that appeared in Sys Admin's July 2000 issue. While many of these articles and scripts focus on the backup itself, few devote any time to restoring data or using a full backup image for system replication. This article will focus on using ufsdump to perform a full backup (0 level dump) of a Sun Solaris workstation or server and then perform a full restore on an identical, unformatted machine.

Our team was asked to create and document a simple, inexpensive procedure for creating an exact duplicate of one of our small internal Solaris servers to act as a cold spare, in the case of catastrophic failure. The server currently acts as an encrypting gateway and router. The data on the disk is fairly static. We knew we would also be asked to use this procedure to clone a few Solaris workstations. The key words "inexpensive" and "simple" limited our search for the perfect utility with tons of bells and whistles, and curtailed our foray into the creation of custom-burned Solaris jumpstart CD-ROMs. We decided to keep it simple and use the base set of UNIX utilities.

Every UNIX distribution comes with a number of backup utilities built into the OS. cp, tar, cpio, and dump are among the most common. Each of these utilities has advantages and disadvantages in terms of speed, reliability, portability, and operability, and each excels at different types of backup jobs. For full-system backups, dump wins out. This is especially true when portability to another operating system is not at issue. dump, as opposed to most other backup utilities, copies all system files, including special files like those created using mknod. It can be configured to follow symbolic links or just copy them as pointers. It also saves file properties such as owner, permissions, and access/modification/creation times and dates.

Armed with our choice of backup utilities, we wrote a simple c-shell script, DUMP2TAP, to pass the correct command-line options to ufsdump (Solaris's version of dump). The script has only a few configuration options and is fairly portable. With a few minor tweaks, it can be made to run under Linux as well. DUMP2TAP runs a full 0-level ufsdump to tape (meaning every file, including special files, links, etc.) on each filesystem that you specify. It handles rewinding the tape and ejecting the tape from the drive so you know when the dump is complete. (Listings for this article are available from the Sys Admin Web site: http://www.sysadminmag.com.)

How to Use DUMP2TAP

Before you begin, you must be the root user because a full backup requires access to all files. You will need to ensure that your system has a working version of the c-shell (/bin/csh or /bin/tcsh), although you do not need to be in csh to launch the script. You will also need a properly configured tape drive and tape media.

Use your favorite editor to set the appropriate device file in the line "set tape_device =". You must select a "no-rewind" tape device for our script to work properly. Note that there may be several tape device files for a given tape drive and that the amount of data you can fit on a tape is related to several factors, including the length of the physical tape media you select and capacity of the device file you select. For example, our Solaris server is configured with a PCI-SCSI card and an Exabyte 8-mm EXB-8500 tape drive. The EXB-8500 has a maximum capacity of 14 GB, but that is only if you use a 160-mm tape and invoke ufsdump with the device file /dev/rmt/0cbn ("0" determines the SCSI setting for the tape drive, "c" forces ufsdump to compress data as it writes to the tape drive, "b" denotes "Berkeley" or SunOS 4.x compatibility, and "n" for "no-rewind"). If we used a 112-mm tape and /dev/rmt/0mbn (medium density, no compression), we would end up with only 5 GB of capacity. Check with your tape drive vendor and read the man files on your tape device files for more information. When in doubt, start by checking out what device files are in /dev/rmt.

That's it for editing the script. Set the script's owner to "root" and the permissions to "755" and run the script from the command line.

The script can be run in two different fashions. Invoking the script with no arguments will display the script's usage and will show each filesystem and corresponding file system sizes.

# ./DUMP2TAP
Usage: DUMP2TAP filesystem(s)
Filesystems on this workstation are: (filesizes in KB)
/ 15789 /usr 209082 /var 181917 /export/home 9 /home1 416944 /opt 223138
Alternatively, the script can be run with an argument to perform a dump of all filesystems.

#DUMP2TAP all
As a final alternative, the script can be run with arguments that indicate "specific" filesystem you wish to back up.

# DUMP2TAP / /usr /var /export/home /home1 /opt
The script will dump each filesystem to tape without rewinding the tape between filesystem dumps. This is why a "no rewind" device file is needed. The script will notify you when it's complete and will rewind and eject the tape. If all of your filesystems are too large to fit onto one tape, you can use this script with multiple tapes, specifying different filesystems for each instance. In either case, be sure to record the exact sequence of arguments you used to invoke the backup, because you will need to know that sequence during the restore process.

The DUMP2TAP script can be run while the workstation or server is active on the network, but it is best to boot down to single-user mode before initiating any sort of dump. This will prevent network users and system processes from modifying files as you are backing them up.

The last step of the backup is to get a printout of your current filesystem partition sizes. Later, we will use the ufsrestore command to build the clone system. The ufsrestore command does not partition the disk for you; you must do that yourself using the format command. To use format, you must know how much disk space to assign to each partition on the new system. So, at a command prompt, type:

# df -k
Write the information down or get a screen print. The relevant information is in the columns labeled "Filesystem", "kbytes", and "Mounted On".

# df -k                                                    
Filesystem            kbytes    used   avail capacity  Mounted on  
/dev/dsk/c0t0d0s0      92231   15789   67222    20%    /           
/dev/dsk/c0t0d0s6     384120  209082  136628    61%    /usr        
/proc                      0       0       0     0%    /proc       
fd                         0       0       0     0%    /dev/fd     
/dev/dsk/c0t0d0s3     524141  181570  290161    39%    /var        
/dev/dsk/c0t0d0s7      96019       9   86410     1%    /export/home
/dev/dsk/c0t0d0s4    2210623  416681 1572882    21%    /home1      
/dev/dsk/c0t0d0s5     288556  223122   36584    86%    /opt        
swap                  398584      24  398560     1%    /tmp
Write down your root password and save it with the printed output and the backup tape. You'll need this later too.

How to Restore from the dump Tape

1. Configure the hardware on the clone machine and attach the tape drive. Insert your dump tape into the tape drive.

2. Turn on the machine. If this is a new, non-pre-configured Sun system, it will ask whether you want it to automatically load Solaris 7 or Solaris 8, using default settings. We could install one of these OS's, reformat each partition manually, and then restore our system from tape, but there is an easier way to reformat new, out-of-the-box systems, as well as existing systems. We will perform an interactive installation using a Solaris CD-ROM. Hit "Stop" and "a" on the keyboard at the same time. This halts the Solaris OS and brings you to the boot-prompt (">"). Insert a Solaris boot CD and close the CD tray. At the boot prompt, type:

> boot cdrom - install
The system will reboot off of the CD-ROM and begin installing Solaris 2.x (see Figures 1 and 2).

The interactive installation process will now guide you through configuring the new machine. Click "Continue" through the screens to set up local time and date, time zone, system name, etc. You will next be asked whether this will be a networked or non-networked machine. Select "non-networked" machine for now, as this will spare you from filling in a few extra screens. When asked if this installation is an Upgrade or Initial, select "Initial." Select "Standalone" on the "System Type" dialog screen. On the next screen, you will be asked to choose a "software group" to install. Select the "End User System Support" software group and hit "Continue".

All of the information up to this point (including network settings, system name, time zone, etc.) will later be overwritten and modified to match our original system when we restore from tape. We've just been going through the motions to get to the disk partitioning screen.

Choose your disk drive from a list of available disks (presumably there is only one drive in the new system at this point). Hit "Continue" again and bypass the "Preserve Data" option. You should be at the disk partitioning section. Choose "Manual Layout" to perform custom partitioning. Here is where we use the df -k output from earlier. Manually set each partition up to match or accommodate the partitions of our original machine. Note that our original system uses a 4.01-GB disk drive, while the new system came with an 8.2-GB drive. We need to make each partition as big -- or bigger -- than its counterpart on the original system. Instead of just doubling each partition size to account for the 2X increase in disk size, we've chosen to allocate much of the additional space to partitions that were nearing capacity on the original system, or are expected to grow significantly. Example of sizing partitions on the new system, assuming an 8.2-GB disk drive:

Slice  Partitions    Size in MB (original)  Size in MB (new disk)
0      /             92.2                  400
1      swap          398.5                 1400
2      overlap       4012 (whole disk)     8222 (whole disk)
3      /var          524                   1400
4      /home1        2210.6                2500
5      /opt          288.5                 1200
6      /usr          384                   1220
7      /export/home  96                    100
Input the second and fourth columns from the table above into the partitioning table on the screen. The partitioning screen will not allow you to allocate more space than is available, and it will show you how much space you have remaining as you tab through each field. When you have completed this step, hit "OK" and then "OK" again on the next screen to enable an automatic reboot after completion. The system will now partition the disks for you and install some base OS files. Grab a cup of coffee -- this will take about 20 minutes.

3. When the interactive installation is complete, your system should be correctly formatted and sized. Now we need to wipe out all of the data on each partition and restore your original system data. The system should automatically reboot after the interactive installation. Log on to the new system as root and type:

# init 0
This will bring you to boot-prompt mode. At the boot prompt, type:

> boot cdrom -sw
This time, no "- install". We will now be running the OS off the CD drive. Temporarily ignore any errors reported on screen about the network interface and graphics display (M64 fifo errors).

In the steps below, we will mount each of the partitions on the hard-disk, one-by-one, to a temporary space called /a, and wipe them clean. The partitions in our example below are named c0d0t0sX, where X = 0 through 6, for a total of seven (7) partitions. (The name of the disk will vary from system to system, though it is easily obtained from the output of a format command.) Here we walk through preparing one of these partitions for the restore procedure:

# newfs /dev/rdsk/c0t0d0s0
# Construct a new file system for /dev/rdsk/c0t0d0s0 ? 
# (Answer Yes or Y)Y
Run this newfs command for the following slices:

c0t0d0s0
(Slices 1 and 2 are swap and overlap. There's no need to do a newfs on c0t0d0s1 as it will remake itself upon each boot. Do not do a newfs on c0t0d0s2!!!)

c0t0d0s3
c0t0d0s4
c0t0d0s5
c0t0d0s6
c0t0d0s7
4. Next, restore all partitions using ufsrestore. Remember that the DUMP2TAP script dumped the original partitions on the dump tape in a very specific order:

/ /usr /var /export/home /home1 /opt
Each of these partitions must be restored separately, and in order, to their respective partitions on the server's hard disk. Again, we mount the partition to /a, change directories into /a, issue a ufsrestore command, remove any "symlinkstable" files that get created during the restore, change directories out of /a, and unmount. Repeat these steps for each partition. The ufsrestore command will take an argument and an incrementing numerical value to tell the tape drive to "skip to" (-s) the needed section of the dump tape, which corresponds to the partition we are attempting to restore. (Note that the number after -s ("skip to") is related to the order in which partitions were written to the dump tape. They do not necessarily equate to the number of the disk slice. For example, if I skip to the second section of my dump tape, that would be the /usr partition. /usr is not slice #1 on my disk, it is slice #6, as in c0t0d0s6.)

Tar Tape Position     Slice #      Partition Name
1                     c0t0d0s0     /
2                     c0t0d0s6     /usr
3                     c0t0d0s3     /var
4                     c0t0d0s7     /export/home
5                     c0t0d0s4     /home1
6                     c0t0d0s5     /opt
For the restore operation, therefore, type:
# mount /dev/dsk/c0d0t0s0 /a
# cd /a
# ufsrestore rvfs /dev/rst/0cbn 1
# rm ./symlinkstable
# cd ..
# umount /a
Then, increment the tape position number and change the slice number accordingly:

# mount /dev/dsk/c0d0t0s6 /a
# cd /a
# ufsrestore rvfs /dev/rst/0cbn 2
# rm ./symlinkstable
# cd ..
# umount /a
and so on until you have done all six tape positions.

5. When you've finished looping though all of the partitions, you should have a bootable, and fully configured, cloned Solaris machine. Bring yourself down to boot-prompt mode via "init 0". Eject the CD-ROM, and at the boot prompt, type "boot".

> boot
Depending on the factory configuration of the new Ultra, you may now encounter a problem: the drive won't boot! Our old Ultra was factory preconfigured to boot from disk 0, but I have found that new Ultra's are factory preconfigured to boot from disk 1. You can test this by typing:

>boot disk0
then, your system should boot. If it does, then the problem is simple to fix. Bring yourself down to boot-prompt mode via "init 0". At the boot prompt, the following command will permanently set nvram to boot from disk 0:

> setenv boot-command disk0
> reset
6. Now the system is booted and should be up and running. Make sure that the original system and the cloned system are not both on the network now, because they may both have the same IP address. The default root password used just before you created the dump tape will be the same for this new system. Try logging in as root.

If this system is intended as a "cold spare" machine, then you can boot down and put the system away for the day you hope will never come.

7. If you intend to deploy this system on the network, you will need to modify some, if not all, of the following files (assuming you are using static IP addressing and host naming):

/etc/inet/hosts
/etc/inet/resolv.conf
/etc/inet/defaultrouter
/etc/inet/defaultdomain
Change the name of the system, where "newname" is the new name you want for this system:

# hostname newname
Edit the /etc/hostname.hme0 file to reflect the new system hostname. Be sure that this exact name is used in /etc/hosts on the "loghost" line.

You may also want to consider changing root's password before you reboot.

#init 6
One last note: The dump tape can be used to restore back to the original machine or to create a clone of the original machine. Cloning works best if the clone machine has the same hardware as the original machine. Of course, this isn't always possible. In our case, we purchased the clone machine off the same specifications sheet from the previous machine. However, in the 6 months that had passed, the processor had gotten twice as fast, the disk drive doubled, and the price fell. Some of the internal chips had changed as well.

After our cloning extravaganza, we kept getting an awful error message on the system console of the new machine each time we booted ("M64 fifo error, bus_reg=7b23a040"). We found that the graphics chipset had changed slightly between the two machines and that a simple patch from http://www.sunsolve.com was needed (patch ID #103792).

Conclusion

This article outlined a relatively simple, inexpensive procedure for cloning a small workstation or server. This worked well for us as a quick and dirty method to replicate one of our in-house systems. The procedure solves only a very small subset of possible backup requirements. Although the script can be modified to accommodate other needs, it may be more fruitful to search the Net for other utilities. It shouldn't be hard to find something that matches your budget, time, and technical requirements.

References

System Administration Guide, Volume I, November 1995, available from http://www.sun.com or docs.sun.com.

The UNIX C Shell Field Guide by Gail Anderson and Paul Anderson. Prentice Hall, 1986.

Ben Diamond is the Director of Information Technology at Health Care Compliance Strategies, which authors high-end multimedia CBT training for the healthcare industry. He holds a BS in Labor Relations from Cornell University and an MA in English from the University of Maryland. Ben currently manages an array of Sun servers which support over 50,000 CBT users. He can be reached at: brd@hccsonline.com.

Keil Wurl is a Senior Information Technology Specialist at IBM providing technical sales support for IBM's Printing Systems Division. His primary focus is supporting IBM's Infoprint 2000 high speed network printing systems which uses Sun print servers. He holds a BS in Chemical Engineering from the State University of New York at Buffalo, and a Master of Engineering from Cornell University in Operations Research. He can be reached at: kwurl@us.ibm.com.