User Report: CTAR and Company
Leor Zolman
Most standard Unix and XENIX distributions include several
rudimentary
tools useful for backup and archiving of data. These
tools come in
two broad categories: image-copy utilities and file-by-file
utilities.
The image-copy category includes programs such as dd
and volcopy.
These utilities move bytes along very rapidly between
one filesystem
and another, or between a filesystem and a secondary
storage device
such as a tape drive. They do not, however, support
a great deal of
file selectivity in the process.
tar vs. cpio
The file-by-file category features utilities such as
cpio (mnemonic
for "Copy-In, Copy-Out") and tar (mnemonic
for "Tape
ARchiver"). These programs view a data stream as
a collection
of individual files, and thereby offer much greater
control over file
selection during backup and restore operations. You
can implement
a viable system backup strategy using either of these
utilities or
their variants.
I've found that system administrators tend to favor
either tar
or cpio to the practical exclusion of the other. I,
for example,
have always used tar, and only dust off cpio when I
have no alternative (such as when someone hands me a
disk written
in cpio format). I probably prefer tar because the command-line
format is simpler; cpio requires the user to generate
a list
of filenames with help of a utility such as find, and
then
to pipe the generated list of files to the cpio program.
With
tar, I find it easier to compose commands for tasks
such as
"Write everything in this directory and below to
the tape drive"
and have those commands work right the first time.
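To make the difference concrete, here is a minimal sketch of the two command styles side by side. It writes to ordinary archive files in a scratch directory rather than to a tape device; the directory and file names are placeholders, and the cpio half runs only if cpio happens to be installed.

```shell
#!/bin/sh
# Compare the tar and cpio command styles on a throwaway directory tree.
# Archives go to ordinary files (placeholder names), not to a tape device.
work=$(mktemp -d)
mkdir -p "$work/src/sub"
echo "hello" > "$work/src/sub/file.txt"

cd "$work/src" || exit 1

# tar: name the starting directory and tar descends into it by itself.
tar cf "$work/backup.tar" .

# cpio: another program (here find) must generate the file list,
# which is then piped to cpio in copy-out mode.
if command -v cpio >/dev/null 2>&1; then
    find . -print | cpio -o > "$work/backup.cpio" 2>/dev/null
fi

# List the tar archive to confirm what was written.
tar tf "$work/backup.tar"
```

On a real system the archive file name would be replaced by the tape device.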
tar, even with its more convenient command-line syntax,
has
its idiosyncrasies. Back when I used vanilla tar for
system
backups, I could not persuade it to write out empty
directories. This
had a particularly nasty effect when backing up the
root partition,
which contained the lp subsystem, which contained an
(empty)
spool directory for each of our umpteen different system
printers.
Exit tar
What finally prompted us to look beyond vanilla tar
was the
plain old issue of data bloat. Our tape drive was an
Archive 150-Meg
QIC, while our hard-disk capacity had grown to around
900 MB on-line.
We had limited the size of disk partitions to 140 Megs
so they would
each fit on a single tape cartridge and allow unattended
nightly backups
of at least one filesystem per night. The problem, however,
was that
one filesystem per night had become insufficient. We
had critical
data on several filesystems: we needed to fully back
up the main database,
composed mostly of huge data files, every night (since
an incremental
backup would involve around 90 percent of the data anyway),
and the
remaining filesystems needed at least incremental backups
on a nightly
basis. There was no way tar could fit that much data on
a single
one of our tapes.
Enter CTAR
Seeing an ad for UniTrends' CTAR package, a "tar
with compression," I immediately gave them a call
and found a
software-only solution to our backup requirements. That
solution actually
involves two products: CTAR for the creation of compressed
archive volumes, and an unadvertised utility named dataset.
CTAR's compression feature made it possible to fit the
quantity
of data we needed backed up daily onto a single tape.
However, we
still needed a way to write a number of individual backups,
some full
and some incremental, onto a single cartridge tape.
And Its Trusty Sidekick, dataset
If we had been forced to write everything out onto a
single tape volume,
and told CTAR (or any backup program) to back up everything
starting from the root, then we would have been limited
to a single
full or incremental backup of everything, with no picking-and-choosing
between filesystems. The dataset utility solves this
dilemma
by supporting multiple logical volumes on a single physical
cartridge tape. Using dataset to position the tape between
successive invocations of CTAR, a single tape can receive
a
mixture of full and incremental backups without any
operator intervention.
dataset is a very simple command, in every sense of
the word.
Its usage is:
dataset seq-no block-size
where seq-no is the sequence number of the logical
volume desired (1,2,3, etc., though not always - see
below), and
block-size is the block size of the tape device. On
which device
does dataset perform the seek? It is hard-wired to use
device
/dev/nrct0, which may be (and ought to be) a link to
the backup
tape device.
dataset always works by starting at the beginning of
the device
and seeking past (seq-no - 1) logical EOT (End-Of-Tape)
markers to arrive at the desired logical volume position.
Since CTAR
(and tar) always conclude their runs by rewinding the
tape,
a multi-volume backup spends much of its time repeatedly
seeking and rewinding the tape. Those Archive drives
must be pretty
robust, however, since the same unit has now been performing
one full
and six incremental dumps per night every night for
the past three
years without maintenance. Plus, in all that time only
about three
or four of the twenty-odd tapes we use in rotation have
bitten the
dust.
On some UNIX systems (SCO's entire stable, for example),
tape write
operations are always terminated by two EOT markers
instead
of just a single one. dataset is not especially intelligent
about this, so we have had to use consecutive odd integers
instead
of simple consecutive integers to address the logical
backup volumes
of our SCO XENIX and UNIX systems. This is a minor annoyance
when
compared to the immense flexibility multiple logical
volumes provide.
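The arithmetic behind the "consecutive odd integers" can be sketched as a small shell function. The name dataset_seq is hypothetical (it is not part of the CTAR package); it simply applies the rule that dataset skips (seq-no - 1) markers, given how many EOT markers the system writes after each volume.

```shell
#!/bin/sh
# Map a logical volume number (1, 2, 3, ...) to the seq-no argument that
# dataset needs, given how many EOT markers the system writes after each
# volume (1 on most systems, 2 on the SCO systems described above).
# dataset_seq is a hypothetical helper, not part of the CTAR package.
dataset_seq() {
    volume=$1
    marks_per_volume=$2
    # dataset skips (seq-no - 1) markers, and (volume - 1) volumes
    # precede the one we want.
    echo $(( (volume - 1) * marks_per_volume + 1 ))
}

# With two markers per volume the sequence numbers come out odd.
for v in 1 2 3 4; do
    echo "volume $v -> dataset seq-no $(dataset_seq "$v" 2)"
done
```

With one marker per volume the function returns the volume number unchanged, which is the simple case described earlier.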
When a backup procedure can take three or four hours
(again, mostly
due to the seek-time overhead described above), even
when scheduled
to take place in the dead of night, an administrator
must consider
the possibility of some data getting changed "out
from under"
the backup process. In the premiere issue of Sys Admin
(May/June
1992), I described the spooling overnight job scheduler
I designed
for our company's UNIX system. With a substantial number
of variable-length
jobs executing sequentially each night, running a backup
process at
a fixed hour was not a good idea: if we had to restore
from a crash
the next day, it would always be questionable whether
the versions
on the tape were newer or older than the ones on disk.
The answer
was to set up our backup driver script as "just
another" overnight
spooler job, albeit the one with the lowest priority.
Now, the backup
process doesn't begin to happen until all the other
jobs have finished
running.
To provide an example of how CTAR and dataset work together,
Listings 1-5 show the actual driver scripts I wrote
to perform our
backups. Several times a day (for redundancy, in case
the system is
down during one of the scheduled run times), spooldumps.sh
(Listing 1) runs from the cron table to spool dump.sh
(Listing 2) for overnight execution at the lowest possible
priority (7).
No other jobs are ever spooled at priority 7, to ensure
that dump.sh
runs only after all other overnight tasks have finished.
With /u3/Backup
as the working directory (set by the cd statement in
spooldumps.sh),
dump.sh provides some logging around the main backup
driver
script, dailybak.sh (Listing 3).
To keep a complete record of all the night's backup
activities in
a single cumulative file, dailybak.sh makes use of CTAR's
automatic logging mechanism. The environment variable
CTARFILE
specifies the name of the file to which CTAR will write
its
logging information. After each backup step, dailybak.sh
appends
the contents of the log file just written to a cumulative
file named
by INCRLOG.
dailybak.sh maintains the four previous cumulative logs
in
FIFO sequence, with INCRLOG.2 the next most recent and
INCRLOG.5
the oldest. dailybak.sh positions the tape for each
partition's
backup using dataset, and calls one of two scripts to
perform
the actual backup: incrbak.sh (Listing 4) for the incrementals
or fullbakp.sh (Listing 5) for the full dump of /u4.
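The log handling described above might look roughly like the following. This is a sketch, not the code from the listings: the function names are hypothetical, and only the .2 through .5 suffix scheme is taken from the description.

```shell
#!/bin/sh
# Rotate a cumulative log so that the previous four runs are kept as
# LOG.2 (most recent) through LOG.5 (oldest). rotate_logs and
# append_step_log are hypothetical helpers for illustration.
rotate_logs() {
    log=$1
    n=5
    while [ "$n" -gt 2 ]; do
        prev=$((n - 1))
        # Shift each older generation up by one; the old .5 is overwritten.
        if [ -f "$log.$prev" ]; then
            mv -f "$log.$prev" "$log.$n"
        fi
        n=$prev
    done
    # The log from the last run becomes generation 2.
    if [ -f "$log" ]; then
        mv -f "$log" "$log.2"
    fi
    : > "$log"      # start the new cumulative log empty
}

# Append one step's log (the file CTARFILE names) to the cumulative log.
append_step_log() {
    step_log=$1
    cumulative=$2
    cat "$step_log" >> "$cumulative"
}
```

A driver would call rotate_logs once at the start of the night and append_step_log after each backup step.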
The filesystems we're most likely to want to recover
files from are
saved first on the tape for quickest access. /u is the
filesystem
containing users' home directories; since accidental
user file deletion
is the most common reason for accessing the backup tape,
we back up
/u in the first logical tape volume.
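As an illustration of the overall flow, the following dry run prints the commands a nightly driver might issue, without touching the tape. Several details are assumptions rather than facts from the listings: the order of filesystems after /u, the single-argument calling convention for the backup scripts, the block size handed to dataset, and the function name plan_backups itself.

```shell
#!/bin/sh
# Dry run: print the commands a nightly driver might issue.
# Assumptions (not taken from the listings): the filesystem order after
# /u, the one-argument calling convention of the backup scripts, and the
# block size passed to dataset. plan_backups is a hypothetical helper.
plan_backups() {
    blocksize=$1
    shift
    seq=1
    for fs in "$@"; do
        echo "dataset $seq $blocksize"
        if [ "$fs" = "/u4" ]; then
            echo "fullbakp.sh $fs"      # full, compressed dump
        else
            echo "incrbak.sh $fs"       # incremental dump
        fi
        seq=$((seq + 2))                # odd numbers: two EOT markers on SCO
    done
}

plan_backups 120 /u /u4 / /u1 /u2 /u3 /u5
```

The plan contains one full and six incremental dumps, with /u in the first logical volume.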
To perform full backups, two variations of the driver
script are available
(as links). fullbakp.sh, as invoked by dailybak.sh,
uses CTAR's data compression facility, while the other
link,
fullbak.sh, does not use compression. We use compression
during
the nightly backup to fit everything onto a single tape.
Individual
full backups, usually performed manually on the weekends,
are written
without compression (whenever there is enough room on
the tape) for
maximum speed. With CTAR 3.4's new S option for double
buffering, however, along with generally increasing
CPU speed, there
is no longer all that much speed difference between
compressed and
uncompressed backups. They all go about as fast as the
hardware can
support.
To wrap up this case study, let's take a look at the
command lines
in incrbak.sh and fullbak.sh that actually invoke CTAR.
First, the OPTIONS variable is set to the option list
appropriate
for each variation. The options used in every case are:
c, to create an archive; 8, to select the device; and V,
to create the log file named by the CTARFILE environment
variable.
The entry for device 8 in our /etc/default/ctar configuration
file happens to be:
archive8=/dev/erct0 120 150000 y # error correcting tape
so CTAR gets the device name, block, and tape
size parameters right out of the file. The format of
this configuration
file is exactly the same as that for standard tar's
configuration
file, /etc/default/tar.
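To show how such an entry breaks down, here is a small sketch that splits one line into its fields. The helper name parse_archive_entry is hypothetical, and reading the fourth field as a tape yes/no flag is an assumption based on the usual /etc/default/tar layout; the function works on the text handed to it and does not read the configuration file.

```shell
#!/bin/sh
# Pull the fields out of one archiveN entry in the style shown above.
# parse_archive_entry is a hypothetical helper for illustration; it
# treats the fourth field as a tape yes/no flag, which is an assumption.
parse_archive_entry() {
    entry=${1%%#*}          # drop the trailing comment
    key=${entry%%=*}        # e.g. archive8
    value=${entry#*=}       # e.g. /dev/erct0 120 150000 y
    # Deliberately unquoted: split the value on whitespace.
    set -- $value
    echo "key=$key device=$1 blocking=$2 size=$3 tape=$4"
}

parse_archive_entry 'archive8=/dev/erct0 120 150000 y # error correcting tape'
```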
The option list for incremental backups in incrbak.sh
also
includes I, to specify incremental, and P, to force
data compression. When P is specified, CTAR (by default)
compresses all non-executable files that are at least
15 blocks in
length, except for those files ending with standard
compressed-file
extensions (e.g., .Z, .zip, etc.). Many additional options
are available to control compression criteria, including
the ability
to exclude certain file extensions or entire directory
trees from
being compressed. If you have compressed files with
non-standard extensions,
however, and you don't specify the options that tell CTAR
about them, CTAR will try to compress them again, and
those files could conceivably end up larger in the archive
than they were on disk (negative compression).
You can usually avoid that through proper use of the
compression-related
options.
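The default rule can be approximated in a few lines of shell, which may help when predicting which files a P backup will compress. This is only a sketch under stated assumptions: a "block" is taken to be 512 bytes, only the two extensions named above are treated as already compressed (the real list is longer), and would_compress is a hypothetical helper rather than anything CTAR provides.

```shell
#!/bin/sh
# Approximate the default compression rule described above for one file.
# Assumptions: a "block" is 512 bytes, and only the two extensions named
# in the text count as already compressed (the real list is longer).
# would_compress is a hypothetical helper, not a CTAR command.
would_compress() {
    file=$1
    case $file in
        *.Z|*.zip) return 1 ;;          # already compressed
    esac
    [ -x "$file" ] && return 1          # executables are skipped by default
    bytes=$(wc -c < "$file")
    blocks=$(( (bytes + 511) / 512 ))   # round up to whole blocks
    [ "$blocks" -ge 15 ]                # succeed only at 15 blocks or more
}
```

For example, `would_compress report.dat && echo "would be compressed"` prints the message only for a non-executable file of at least 15 blocks.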
For full backups via fullbak.sh, the option list includes
M
for "Master backup." This instructs CTAR to
create
a log file named ./etc/Master_backup in the root directory
of the filesystem being backed up. This log file contains
the date
of the backup, and is used by CTAR to determine which
files
to back up in subsequent incremental backups of the
same filesystem.
The M option also automatically triggers certain other
options
appropriate for a master filesystem backup, such as
R, which
will lock files (against modification by other programs)
while they
are being backed up.
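The idea of choosing files for an incremental backup by comparing them against a dated marker can be illustrated with the standard find command. This is an analogy, not CTAR's documented algorithm: the stamp file here merely stands in for the Master_backup log, and changed_since is a hypothetical helper.

```shell
#!/bin/sh
# Illustrate timestamp-based incremental selection with find -newer.
# This is an analogy, not CTAR's documented algorithm: the stamp file
# stands in for the Master_backup log. changed_since is hypothetical.
changed_since() {
    stamp=$1
    dir=$2
    # List regular files modified more recently than the stamp file.
    find "$dir" -type f -newer "$stamp" -print
}
```

Run from a filesystem's own root, `changed_since ./etc/Master_backup .` would list the files modified since the marker was last written.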
Both incrbak.sh and fullbak.sh handle a backup of the root
filesystem differently from backups of other filesystems.
If the filesystem
being backed up is not the root, both scripts invoke
a cd command
before CTAR to perform the backup relative to the individual
filesystem's own root directory. When backing up the
system root,
however, the other filesystems are still mounted; if
we didn't take
steps to prevent it, CTAR would attempt to write the
data from
all mounted filesystems to tape. To limit the root backup
to data
on the actual root filesystem, we use CTAR's "exclude"
option to specify the mount points for all non-root
filesystems. For
example, the command sequence:
cd /
ctar MV8EEEEEE /u /u1 /u2 /u3 /u4 /u5 .
would create a Master backup (c is one of the
options implied by the M option), with logging, to archive
device 8, and include all files and subdirectories of
the root except
/u through /u5. Each E in the option string matches
up with one of the list of directories to be excluded.
The final .
in the command line specifies the base directory for
the backup, which
in this case is the system root.
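Since the number of E options must match the number of excluded directories, a script can build the option string from the list of mount points rather than hard-coding it. The sketch below only prints the resulting command; root_backup_cmd is a hypothetical helper name.

```shell
#!/bin/sh
# Build a root-filesystem master backup command with one E per excluded
# mount point. The command is only printed, never executed.
# root_backup_cmd is a hypothetical helper.
root_backup_cmd() {
    opts="MV8"
    for mp in "$@"; do
        opts="${opts}E"     # one E for each directory to exclude
    done
    echo "ctar $opts $* ."
}

root_backup_cmd /u /u1 /u2 /u3 /u4 /u5
```

With the six mount points used here, the function prints the same command line as the example above.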
Crash Protection with Airbag
We ran XENIX for a long time before moving up to UNIX,
and the version
of CTAR UniTrends distributes for XENIX came bundled
with a
crash-recovery package named Jet RestoreEase. The UNIX
version is
called "System Crash Airbag," but it is still
essentially
the same software. Despite the awkward appellations,
this package
did an admirable job of integrating CTAR and some custom
scripts
onto a highly robust boot/root disk set for use in disaster
recovery.
Airbag's quality as a root/boot generator becomes apparent
when you
first go to create the disk set. Airbag supplies exhaustive
annotation
of exactly what it is doing during each nitty-gritty
step of the process.
You can see each file as it is being written to the
floppy, watch
the kernel get automatically compacted and compressed,
and see
how much space is available at every juncture. Airbag
asks you questions
about your system and customizes the boot set accordingly.
It even
suggests how to verify the integrity of the boot set
and master backup
tape without jeopardizing the integrity of your existing
system.
Airbag in Action
I now run UNIX at home on my "old" machine,
a 386-25 with
a Colorado Jumbo 250 tape drive running off the floppy
controller.
I've been having some problems lately with this system's
two old ESDI
drives, and several bouts with hard-disk errors have
resulted in a
trashed filesystem. Since I never really trust a drive
after a crash
unless I can get it to successfully perform a low-level
format, I've
had plenty of opportunities to let my Airbag/CTAR backup
rebuild
my entire system for me.
The very first time I had to do this on my system, it
had been quite
a while since the last time I'd gone through the process,
and that
was under XENIX with Jet RestoreEase. I was tired, frustrated
and
impatient; I concluded prematurely that Airbag wasn't
going to work,
and re-installed the entire UNIX system from the original
installation
disks and various masters and backup disks. I created
a new boot/root
set and master backup tape; when my hard drives went
bonkers again
soon thereafter, I created a new Airbag disk set and
realized why
my efforts the first time had met with failure. Airbag
is hard-wired
to restore from tape device /dev/rct0 with a blocking
factor
of 120 by default. When Airbag asked me for the device
name, I had
looked at it and presumed that default to be correct;
actually, all
I had needed to do was to override the default with
the actual device
name of the Colorado tape drive on my system, /dev/rjt0,
and
give it a blocking factor of 20. When I'd accepted the
defaults of
/dev/rct0 and 120, of course it couldn't read my drive!
I've suggested that Airbag be made smart enough to know
the correct
default tape device for the particular system on which
the boot/root
disk set was created.
Floppy-Based Tape Decks Redeemed by Airbag
During creation of the boot/root set, Airbag asks whether
or not the
tape drive is running off the floppy controller. This
is an important
question, since it seems that with floppy-based tape
decks it is impossible
to boot off a floppy disk and then perform a restore
from the tape
drive when the active system is still running off the
floppy disk.
To use the OEM boot/root software that came bundled
with my floppy-based
tape drive, I first had to obtain a dedicated controller
card. Airbag,
however, provides a way around this whole problem without
requiring
the purchase of a dedicated tape controller card.
From my experience, here's how the process goes when
you have to rebuild
an entire UNIX system from a floppy-based tape deck
onto a newly formatted
hard disk, given an Airbag/CTAR backup disk set:
1. Boot off the A1 boot floppy: cold-boot to the
standard SCO boot prompt, then press the Return
key
to load the kernel from floppy. When the kernel is in
memory, it asks
for the root floppy, A2.
2. Insert A2 and press Return to complete the boot process.
This puts you at the main Airbag menu (see Figure 1).
3. Choose option #4 to restore to a re-initialized disk
drive.
4. At this point, Airbag takes you through much the
same sequence
of utilities as the standard XENIX/UNIX installation
does: enter hard
disk parameters, configure the partition table, perform
bad tracking,
set up block-by-block logical partitions, etc. Whenever
practical,
Airbag tells you how the system was set up in its last
incarnation,
so you can reproduce the same exact configuration. If
you want to
change it, however, you are free to do that.
5. Once all the preliminaries are out of the way, Airbag
announces
it is about to set up to "Restore the Master Backup
for Floppy-Based
Tape"; this process involves setting up an entire
bootable system
image on the hard disk, but in such a way that no harm
comes to the
still-possibly-valid data on the hard disk (if any).
(The precise
manner in which Airbag accomplishes this feat is proprietary,
but
any halfway-conscious system administrator ought to
be able to figure
out how it's done from the messages generated by the
software.) Setting
up for re-boot takes about three minutes.
6. At this point, Airbag takes you through about six
screens of instructions,
cautions, and warnings all boiling down basically to
this: wait for
the system to shut down, plug the A1 floppy into the
drive, boot it,
and enter the string "again" when the boot
prompt appears.
That isn't very complicated to say, but in its zeal
to be ultra user-friendly,
Airbag managed to make me feel so secure that I followed
its
instructions literally and ended up with a corrupted
A2 root floppy.
The last screen of the message sequence says, at the
very top, "Remove
disk A2 and insert boot disk A1." When that screen
pops up, however,
the system is still writing some information onto disk
A2. If you
wait long enough, eventually the floppy is unmounted
and the "Safe
to Reboot" message appears; however, the warning
to wait for that
message before removing the floppy is, at this point,
several screens
back in Airbag's message sequence, and by the time I'd
reached the
final screen I had forgotten about that warning. So,
I proceeded to
do exactly what it said: I switched floppies, without
looking too
hard at the busy light on that drive. Of course, we
all know what
happens when you remove a mounted floppy in the middle
of an I/O operation!
In all fairness, Airbag does warn you to create an extra
copy of the
A2 root disk in case something like this happens. I,
naturally, had
ignored the warning.
7. After you have waited for the shutdown message and
rebooted from the A1
boot floppy, the SCO boot prompt appears. At this point,
instead of
hitting the usual Enter key, type "again".
This instructs
the boot program to read the kernel off the floppy,
but use the magic
temporary area on the hard disk for the root filesystem.
After the
kernel "wakes up" with a hard-disk-based root
filesystem,
a program runs that asks you to mount the backup tape
in the tape
drive and prompts for the tape device and blocking factor,
with /dev/rct0
and 120 the defaults, as mentioned previously.
8. After a few more questions - such as "Do you
want to overwrite
existing data?" and "Does this backup span
multiple volumes?"
- the tape drive fires up and the Master root backup
is restored.
After that, it asks if you have any incrementals you
want to restore.
If so, it presumably loads those in as well (I've never
had any to
restore, but there's no reason at this point to suspect
it wouldn't
work) and shuts down when finished.
9. Now make sure there's no floppy in the drive, and
boot up normally.
What you should see is your perfectly restored UNIX
system, just as
it was at the time of the last tape backup.
If your tape drive has its own dedicated controller,
then the procedure
is the same except that you'll skip steps 5-7. If you're
not worried
about the integrity of the drive, you can skip all the
low-level configuration
utilities by choosing selection #1 from the main menu,
and let Airbag
automatically set everything up for you exactly the
way it was before.
I haven't seen that one work, because all my restore
attempts have
come after pretty catastrophic crashes, but I have no
reason to doubt
that it would work.
If you don't want to jump right in and attempt a tape
restore after
booting off the Airbag disks after a crash, you can
invoke a shell
from the main menu (option #7) and poke around a bit
first to see
how bad the damage was. Also, you can bring up the Utilities
menu
(Figure 2) and attempt to repair the disk yourself.
This might be
necessary if you've made some significant filesystem
or system configuration
changes since writing out your last root filesystem
backup and/or
creating your most recent Airbag disk set.
Other CTAR Features
When you enter the CTAR command with no arguments, the
help
screen shown in Figure 3 displays. As you can see, there
are lots
of options and features, only a few of which I describe
here.
CTAR supports a limited use of wildcards during restore
operations.
The wildcards must appear at the start or the end of
a string, but
not in the middle. This is still pretty useful for selective
restores.
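A script that builds restore patterns could check them against this restriction before calling CTAR. The sketch below assumes the wildcard character is `*`, which the description above does not state explicitly; pattern_ok is a hypothetical helper.

```shell
#!/bin/sh
# Check that a pattern uses * only at its start or end, never in the
# middle. Assumption: the wildcard character is *.
# pattern_ok is a hypothetical helper, not part of CTAR.
pattern_ok() {
    inner=${1#\*}           # strip one leading *
    inner=${inner%\*}       # strip one trailing *
    case $inner in
        *\**) return 1 ;;   # a * remains in the middle
        *)    return 0 ;;
    esac
}

for p in '*.c' 'src*' 'a*b'; do
    if pattern_ok "$p"; then
        echo "$p: acceptable"
    else
        echo "$p: wildcard in the middle"
    fi
done
```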
The Z option forces compression of executable files.
By default,
CTAR does not compress executables.
The N option specifies a non-destructive restore. This
tells
CTAR to restore only files that do not already exist
on disk.
When you press the INTR key (usually DEL) during a CTAR
backup, it doesn't just abort; instead, you're presented
with a mini-menu
(Figure 4), and you get to select whether to "really
abort,"
finish the current file and then abort, or just continue
with normal
operation. This is handy, especially when you accidentally
hit the
INTR key after most of a lengthy backup!
Normally, when a program runs in the background, you
can't interact
with it. CTAR, however, provides a way to let you talk
to it
when it is running in the background. If some kind of
error occurs
and CTAR requires interactive input, it sends its prompt
message
to standard output (nothing unusual there), but it reads
its input
from a special device called /dev/ctar_listen. Thus,
if CTAR
asks you a question requiring a yes or no response,
you just send
your response to that device by saying:
echo y > /dev/ctar_listen
and CTAR will hear it while running in the background.
CTAR provides full support for virtual files, including
the
ability to restore them in their virtual entirety, "holes"
and all. However, you must prepare a text file containing
the list
of all such virtual files, and tell CTAR the location
of that
file via the VIRTUAL_LIST environment variable. CTAR
won't detect virtual files without this assistance.
The Menu System
A recent addition to the CTAR package is a full-screen
interactive
menu interface. Invoked by the ctmenu command, the menu
interface
allows users to compose and catalog customized CTAR
command
strings and run them on demand, configure the device
library and unattended
backup schedule, invoke utilities such as dataset (if
present),
view and/or purge CTAR log files, and generally work
through an
interface that shields them from command-line use
of CTAR.
A tech support sub-menu helps you prepare various kinds
of problem
report sheets to FAX to UniTrends should you happen
to need support
(phone support is also available at no additional charge).
With a command as versatile as CTAR, it can take some
time
to figure out the right combination of options when
you're trying
to do something unusual by hand. The menu system is
handy for anyone
who would like to perform such backups without having
to browse through
the reference guide for all the needed options.
In Summary
CTAR, dataset, and Airbag/Jet RestoreEase have been
an essential part of my system administrator's arsenal
for several
years, and I'd be really worried if I didn't have a
current Airbag
backup of my UNIX system. Under XENIX, the combination
of CTAR/Jet
RestoreEase and a floppy-based tape deck provides cheap,
easy, and
reliable backup security (CTAR and Airbag cost slightly
more
for UNIX, where they do not come bundled).
About the Author
Leor Zolman wrote BDS C, the first C compiler targeted
exclusively for
personal computers. Leor is currently an instructor
on UNIX topics for
Boston University's Corporate Education Center, a regular
contributor to
The C Users Journal and Sys Admin magazines, and "Tech
Tips" editor for
Windows/DOS Developer's Journal. His first book, Illustrated
C, was recently
published by R&D Publications, Inc. He may be contacted
at 74 Marblehead St.,
North Reading, MA 01864, or on Usenet/Internet as: leor@bdsoft.com.