Cover V02, I04
Article
Figure 1
Figure 2
Figure 3
Figure 4
Listing 1
Listing 2
Listing 3
Listing 4
Listing 5
Sidebar 1

jul93.tar


User Report: CTAR and Company

Leor Zolman

Most standard Unix and XENIX distributions include several rudimentary tools useful for backup and archiving of data. These tools come in two broad categories: image-copy utilities and file-by-file utilities.

The image-copy category includes programs such as dd and volcopy. These utilities move bytes along very rapidly between one filesystem and another, or between a filesystem and a secondary storage device such as a tape drive. They do not, however, support a great deal of file selectivity in the process.

tar vs. cpio

The file-by-file category features utilities such as cpio (mnemonic for "Copy-In, Copy-Out") and tar (mnemonic for "Tape ARchiver"). These programs view a data stream as a collection of individual files, and thereby offer much greater control over file selection during backup and restore operations. You can implement a viable system backup strategy using either of these utilities or their variants.

I've found that system administrators tend to favor either tar or cpio to the practical exclusion of the other. I, for example, have always used tar, and only dust off cpio when I have no alternative (such as when someone hands me a disk written in cpio format). I probably prefer tar because the command-line format is simpler; cpio requires the user to generate a list of filenames with help of a utility such as find, and then to pipe the generated list of files to the cpio program. With tar, I find it easier to compose commands for tasks such as "Write everything in this directory and below to the tape drive" and have those commands work right the first time.

tar, even with its more convenient command-line syntax, has its idiosyncrasies. Back when I used vanilla tar for system backups, I could not persuade it to write out empty directories. This had a particularly nasty effect when backing up the root partition, which contained the lp subsystem, which contained an (empty) spool directory for each of our umpteen different system printers.

Exit tar

What finally prompted us to look beyond vanilla tar was the plain old issue of data bloat. Our tape drive was an Archive 150-Meg QIC, while our hard-disk capacity had grown to around 900 Mb on-line. We had limited the size of disk partitions to 140 Megs so they would each fit on a single tape cartridge and allow unattended nightly backups of at least one filesystem per night. The problem, however, was that one filesystem per night had become insufficient. We had critical data on several filesystems: we needed to fully back up the main database, composed mostly of huge data files, every night (since an incremental backup would involve around 90 percent of the data anyway), and the remaining filesystems needed at least incremental backups on a nightly basis. There's no way tar could fit that much data on a single one of our tapes.

Enter CTAR

Seeing an ad for UniTrend's CTAR package, a "tar with compression," I immediately gave them a call and found a software-only solution to our backup requirements. That solution actually involves two products: CTAR for the creation of compressed archive volumes, and an unadvertised utility named dataset. CTAR's compression feature made it possible to fit the quantity of data we needed backed up daily onto a single tape. However, we still needed a way to write a number of individual backups, some full and some incremental, onto a single cartridge tape.

And Its Trusty Sidekick, dataset

If we had been forced to write everything out onto a single tape volume, and told CTAR (or any backup program) to back up everything starting from the root, then we would have been limited to a single full or incremental backup of everything, with no picking-and-choosing between filesystems. The dataset utility solves this dilemma by supporting multiple logical volumes on a single physical cartridge tape. Using dataset to position the tape between successive invocations of CTAR, a single tape can receive a mixture of full and incremental backups without any operator intervention.

dataset is a very simple command, in every sense of the word. Its usage is:

dataset seq-no block-size

where seq-no is the sequence number of the logical volume desired (1,2,3, etc., though not always - see below), and block-size is the block size of the tape device. On which device does dataset perform the seek? It is hard-wired to use device /dev/nrct0, which may be (and ought to be) a link to the backup tape device.

dataset always works by starting at the beginning of the device and seeking past (seq-no -- 1) logical EOT (End-Of-Tape) markers to arrive at the desired logical volume position. Since CTAR (and tar) always conclude their runs by rewinding the tape, this means that they spend much of a multi-volume backup repeatedly seeking and rewinding the tape. Those Archive drives must be pretty robust, however, since the same unit has now been performing one full and six incremental dumps per night every night for the past three years without maintenance. Plus, in all that time only about three or four of the twenty-odd tapes we use in rotation have bitten the dust.

On some UNIX systems (SCO's entire stable, for example), tape write operations are always terminated by two EOT markers instead of just a single one. dataset is not especially intelligent about this, so we have had to use consecutive odd integers instead of simple consecutive integers to address the logical backup volumes of our SCO XENIX and UNIX systems. This is a minor annoyance when compared to the immense flexibility multiple logical volumes provide.

When a backup procedure can take three or four hours (again, mostly due to the seek-time overhead described above), even when scheduled to take place in the dead of night, an administrator must consider the possibility of some data getting changed "out from under" the backup process. In the premiere issue of Sys Admin (May/June 1992), I described the spooling overnight job scheduler I designed for our company's UNIX system. With a substantial number of variable-length jobs executing sequentially each night, running a backup process at a fixed hour was not a good idea: if we had to restore from a crash the next day, it would always be questionable whether the versions on the tape were newer or older than the ones on disk. The answer was to set up our backup driver script as "just another" overnight spooler job, albeit the one with the lowest priority. Now, the backup process doesn't begin to happen until all the other jobs have finished running.

To provide an example of how CTAR and dataset work together, Listings 1-5 show the actual driver scripts I wrote to perform our backups. Several times a day (for redundancy, in case the system is down during one of the scheduled run times), spooldumps.sh (Listing 1) runs from the cron table to spool dump.sh (Listing 2) for overnight execution at the lowest possible priority. No other jobs are ever spooled at priority 7, to ensure that dump.sh runs only after all other overnight tasks have finished. With /u3/Backup as the working directory (set by the cd statement in spooldumps.sh), dump.sh provides some logging around the main backup driver script, dailybak.sh (Listing 3).

To keep a complete record of all the night's backup activities in a single cumulative file, dailybak.sh makes use of CTAR's automatic logging mechanism. The environment variable CTARFILE specifies the name of the file to which CTAR will write its logging information. After each backup step, dailybak.sh appends the contents of the log file just written to a cumulative file named by INCRLOG.

dailybak.sh maintains the four previous cumulative logs in FIFO sequence, with INCRLOG.2 the next most recent and INCRLOG.5 the oldest. dailybak.sh positions the tape for each partition's backup using dataset, and calls one of two scripts to perform the actual backup: incrbak.sh (Listing 4) for the incrementals or fullbakp.sh (Listing 5) for the full dump of /u4. The filesystems we're most likely to want to recover files from are saved first on the tape for quickest access. /u is the filesystem containing users' home directories; since accidental user file deletion is the most common reason for accessing the backup tape, we back up /u in the first logical tape volume.

To perform full backups, two variations of the driver script are available (as links). fullbakp.sh, as invoked by dailybak.sh, uses CTAR's data compression facility, while the other link, fullbak.sh, does not use compression. We use compression during the nightly backup to fit everything onto a single tape. Individual full backups, usually performed manually on the weekends, are written without compression (whenever there is enough room on the tape) for maximum speed. With CTAR 3.4's new S option for double buffering, however, along with generally increasing CPU speed, there is no longer all that much speed difference between compressed and uncompressed backups. They all go about as fast as the hardware can support.

To wrap up this case study, let's take a look at the command lines in incrbak.sh and fullbak.sh that actually invoke CTAR. First, the OPTIONS variable is set to the option list appropriate for each variation. The options used in every case are: c, to create an archive, 8, to select the device; and V to create the log file named by the CTARFILE environment variable. The entry for device 8 in our /etc/default/ctar configuration file happens to be:

archive8=/dev/erct0  120  150000  y  # error correcting tape

so CTAR gets the device name, block, and tape size parameters right out of the file. The format of this configuration file is exactly the same as that for standard tar's configuration file, /etc/default/tar.

The option list for incremental backups in incrbak.sh also includes I, to specify incremental, and P, to force data compression. When P is specified, CTAR (by default) compresses all non-executable files that are at least 15 blocks in length, except for those files ending with standard compressed-file extensions (e.g., .Z, .zip, etc.) Many additional options are available to control compression criteria, including the ability to exclude certain file extensions or entire directory trees from being compressed. If you have compressed files with non-standard extensions, however, and you don't specify all the right options to inform CTAR about them, your files could conceivably end up with negative compression. You can usually avoid that, however, through proper use of the compression-related options.

For full backups via fullbak.sh, the option list includes M for "Master backup." This instructs CTAR to create a log file named ./etc/Master_backup in the root directory of the filesystem being backed up. This log file contains the date of the backup, and is used by CTAR to determine which files to back up in subsequent incremental backups of the same filesystem. The M option also automatically triggers certain other options appropriate for a master filesystem backup, such as R, which will lock files (against modification by other programs) while they are being backed up.

Both incrbak.sh and fullbak.sh handle a root filesystem backup differently than those for other filesystems. If the filesystem being backed up is not the root, both scripts invoke a cd command before CTAR to perform the backup relative to the individual filesystem's own root directory. When backing up the system root, however, the other filesystems are still mounted; if we didn't take steps to prevent it, CTAR would attempt to write the data from all mounted filesystems to tape. To limit the root backup to data on the actual root filesystem, we use CTAR's "exclude" option to specify the mount points for all non-root filesystems. For example, the command sequence:

cd /
ctar MV8EEEEEE /u /u1 /u2 /u3 /u4 /u5 .

would create a Master backup (c is one of the options implied by the M option), with logging, to archive device 8, and include all files and subdirectories of the root except /u through /u5. Each E in the option string matches up with one of the list of directories to be excluded. The final . in the command line specifies the base directory for the backup, which in this case is the system root.

Crash Protection with Airbag

We ran XENIX for a long time before moving up to UNIX, and the version of CTAR UniTrends distributes for XENIX came bundled with a crash-recovery package named Jet RestoreEase. The UNIX version is called "System Crash Airbag," but it is still essentially the same software. Despite the awkward appellations, this package did an admirable job of integrating CTAR and some custom scripts onto a highly robust boot/root disk set for use in disaster recovery.

Airbag's quality as a root/boot generator becomes apparent when you first go to create the disk set. Airbag supplies exhaustive annotation of exactly what it is doing during each nitty-gritty step of the process. You can see each file as it is being written to the floppy, watch the kernel get automatically compacted and compressed, and see how much space is available at every juncture. Airbag asks you questions about your system and customizes the boot set accordingly. It even suggests how to verify the integrity of the boot set and master backup tape without jeopardizing the integrity of your existing system.

Airbag in Action

I now run UNIX at home on my "old" machine, a 386-25 with a Colorado Jumbo 250 tape drive running off the floppy controller. I've been having some problems lately with this system's two old ESDI drives, and several bouts with hard-disk errors have resulted in a trashed filesystem. Since I never really trust a drive after a crash unless I can get it to successfully perform a low-level format, I've had plenty of opportunities to let my Airbag/CTAR backup rebuild my entire system for me.

The very first time I had to do this on my system, it had been quite a while since the last time I'd gone through the process, and that was under XENIX with Jet RestoreEase. I was tired, frustrated and impatient; I concluded prematurely that Airbag wasn't going to work, and re-installed the entire UNIX system from the original installation disks and various masters and backup disks. I created a new boot/root set and master backup tape; when my hard drives went bonkers again soon thereafter, I created a new Airbag disk set and realized why my efforts the first time had met with failure. Airbag is hard-wired to restore from tape device /dev/rct0 with a blocking factor of 120 by default. When Airbag asked me for the device name, I had looked at it and presumed that default to be correct; actually, all I had needed to do was to override the default with the actual device name of the Colorade tape drive on my system, /dev/rjt0, and give it a blocking factor of 20. When I'd accepted the defaults of /dev/rct0 and 120, of course it couldn't read my drive!

I've suggested that Airbag be made smart enough to know the correct default tape device for the particular system on which the boot/root disk set was created.

Floppy-Based Tape Decks Redeemed by Airbag

During creation of the boot/root set, Airbag asks whether or not the tape drive is running off the floppy controller. This is an important question, since it seems that with floppy-based tape decks it is impossible to boot off a floppy disk and then perform a restore from the tape drive when the active system is still running off the floppy disk. To use the OEM boot/root software that came bundled with my floppy-based tape drive, I first had to obtain a dedicated controller card. Airbag, however, provides a way around this whole problem without requiring the purchase of a dedicated tape controller card.

From my experience, here's how the process goes when you have to rebuild an entire UNIX system from a floppy-based tape deck onto a newly formatted hard disk, given an Airbag/CTAR backup disk set:

1. Boot off the A1 boot floppy, which involves a cold boot to the standard SCO boot prompt, and then pressing the Return key to load the kernel from floppy. When the kernel is in memory, it asks for the root floppy, A2.

2. Insert A2 and press Return to complete the boot process. This puts you at the main Airbag menu (see Figure 1).

3. Choose option #4 to restore to a re-initialized disk drive.

4. At this point, Airbag takes you through much the same sequence of utilities as the standard XENIX/UNIX installation does: enter hard disk parameters, configure the partition table, perform bad tracking, set up block-by-block logical partitions, etc. Whenever practical, Airbag tells you how the system was set up in its last incarnation, so you can reproduce the same exact configuration. If you want to change it, however, you are free to do that.

5. Once all the preliminaries are out of the way, Airbag announces it is about to set up to "Restore the Master Backup for Floppy-Based Tape"; this process involves setting up an entire bootable system image on the hard disk, but in such a way that no harm comes to the still-possibly-valid data on the hard disk (if any). (The precise manner in which Airbag accomplishes this feat is proprietary, but any halfway-conscious system administrator ought to be able to figure out how it's done from the messages generated by the software.) Setting up for re-boot takes about three minutes.

6. At this point, Airbag takes you through about six screens of instructions, cautions, and warnings all boiling down basically to this: wait for the system to shut down, plug the A1 floppy into the drive, boot it, and enter the string "again" when the boot prompt appears. That isn't very complicated to say, but in its zeal to be ultra user-friendly, Airbag managed to make me feel so secure that I followed its instructions literally and ended up with a corrupted A2 root floppy. The last screen of the message sequence says, at the very top, "Remove disk A2 and insert boot disk A1." When that screen pops up, however, the system is still writing some information onto disk A2. If you wait long enough, eventually the floppy is unmounted and the "Safe to Reboot" message appears; however, the warning to wait for that message before removing the floppy is, at this point, several screens back in Airbag's message sequence, and by the time I'd reached the final screen I had forgotten about that warning. So, I proceeded to do exactly what it said: I switched floppies, without looking too hard at the busy light on that drive. Of course, we all know what happens when you remove a mounted floppy in the middle of an I/O operation! In all fairness, Airbag does warn you to create an extra copy of the A2 root disk in case something like this happens. I, naturally, had ignored the warning.

7. After waiting for the shutdown message and rebooting from the A1 boot floppy, the SCO boot prompt appears. At this point, instead of hitting the usual Enter key, type "again". This instructs the boot program to read the kernel off the floppy, but use the magic temporary area on the hard disk for the root filesystem. After the kernel "wakes up" with a hard-disk-based root filesystem, a program runs that asks you to mount the backup tape in the tape drive and prompts for the tape device and blocking factor, with /dev/rct0 and 120 the defaults, as mentioned previously.

8. After a few more questions - such as "Do you want to overwrite existing data?" and "Does this backup span multiple volumes?" - the tape drive fires up and the Master root backup is restored. After that, it asks if you have any incrementals you want to restore. If so, it presumably loads those in as well (I've never had any to restore, but there's no reason at this point to suspect it wouldn't work) and shuts down when finished.

9. Now make sure there's no floppy in the drive, and boot up normally. What you should see is your perfectly restored UNIX system, just as it was at the time of the last tape backup.

If your tape drive has its own dedicated controller, then the procedure is the same except that you'll skip steps 5-7. If you're not worried about the integrity of the drive, you can skip all the low-level configuration utilities by choosing selection #1 from the main menu, and let Airbag automatically set everything up for you exactly the way it was before. I haven't seen that one work, because all my restore attempts have come after pretty catastrophic crashes, but I have no reason to doubt that it would work.

If you don't want to jump right in and attempt a tape restore after booting off the Airbag disks after a crash, you can invoke a shell from the main menu (option #7) and poke around a bit first to see how bad the damage was. Also, you can bring up the Utilities menu (Figure 2) and attempt to repair the disk yourself. This might be necessary if you've made some significant filesystem or system configuration changes since writing out your last root filesystem backup and/or creating your most recent Airbag disk set.

Other CTAR Features

When you enter the CTAR command with no arguments, the help screen shown in Figure 3 displays. As you can see, there are lots of options and features, only a few of which I describe here.

CTAR supports a limited use of wildcards during restore operations. The wildcards must appear at the start or the end of a string, but not in the middle. This is still pretty useful for selective restores.

The Z option forces compression of executable files. By default, CTAR does not compress executables.

The N option specifies a non-destructive restore. This tells CTAR to restore only files that do not already exist on disk.

When you press the INTR key (usually DEL) during a CTAR backup, it doesn't just abort; instead, you're presented with a mini-menu (Figure 4), and you get to select whether to "really abort," finish the current file and then abort, or just continue with normal operation. This is handy, especially when you accidentally hit the INTR key after most of a lengthy backup!

Normally, when a program runs in the background, you can't interact with it. CTAR, however, provides a way to let you talk to it when it is running in the background. If some kind of error occurs and CTAR requires interactive input, it sends its prompt message to standard output (nothing unusual there), but it reads its input from a special device called /dev/ctar_listen. Thus, if CTAR asks you a question requiring a yes or no response, you just send your response to that device by saying:

echo y > /dev/ctar_listen

and CTAR will hear it while running in the background.

CTAR provides full support for virtual files, including the ability to restore them in their virtual entirety, "holes" and all. However, you must prepare a text file containing the list of all such virtual files, and tell CTAR the location of that file via the VIRTUAL_LIST environment variable. CTAR won't detect virtual files without this assistance.

The Menu System

A recent addition to the CTAR package is a full-screen interactive menu interface. Invoked by the ctmenu command, the menu interface allows users to compose and catalog customized CTAR command strings and run them on demand, configure the device library and unattended backup schedule, invoke utilities such as dataset (if present), view and/or purge CTAR log files, and generally provide a user interface that shields users from the command line use of CTAR. A tech support sub-menu helps you prepare various kinds of problem report sheets to FAX to UniTrends should you happen to need support (phone support is also available at no additional charge).

With a command as versatile as CTAR, it can take some time to figure out the right combination of options when you're trying to do something unusual by hand. The menu system is handy for anyone who would like to perform such backups without having to browse through the reference guide for all the needed options.

In Summary

CTAR, dataset, and Airbag/Jet RestoreEase have been an essential part of my system administrator's arsenal for several years, and I'd be really worried if I didn't have a current Airbag backup of my UNIX system. Under XENIX, the combination of CTAR/Jet RestoreEase and a floppy-based tape deck provides cheap, easy, and reliable backup security (CTAR and Airbag cost slightly more for UNIX, where they do not come bundled).

About the Author

Leor Zolman wrote BDS C, the first C compiler targeted exclusively for personal computers. Leor is currently an instructor on UNIX topics for Boston University's Corporate Education Center, a regular contributor to The C Users Journal and Sys Admin magazines, and "Tech Tips" editor for Windows/DOS Developer's Journal. His first book, Illustrated C, was recently published by R&D Publications, Inc. He may be contacted at 74 Marblehead St., North Reading, MA 01864, or on Usenet/Internet as: leor@bdsoft.com.