Fast Backup and Restore Scripts for DAT Drives
Jon Alder and Ed Schaefer
With the advent of 8mm and digital-analog (DAT) cartridge
tape drives,
the volume of data capable of being stored to tape has
increased to
the gigabytes range. The time a system administrator
requires to search
a tape for specific file(s) has also increased accordingly.
If the
traditional sequential tape search could be replaced
by a direct access
method, the time needed to restore specific files(s)
would be significantly
reduced.
Creating a tape with related directories in their own
partition or
volume could emulate a direct access method; the speed
advantage would
be skipping unwanted volumes and placing the tape head
at the beginning
of the volume which contains the desired file(s). With
DAT tape drives,
you can create multi-volume tapes with ordinary UNIX
commands. This
article presents backup, restore, and report shell scripts
for controlling
DAT drives in such a manner. The heart of the model
is creating multiple
backups on the same tape. One backup is defined as a
Tape File System
(TFS) with each TFS residing in its own volume.
The Tape File Mark
When you write to a DAT drive without rewinding using
a command such
as cpio or tar, a tape file mark (TFM) is written on
the tape between each backup. Think of the TFM as dividing
the tape
into separate volumes. Figure 1 shows a TFM between
two different
tape file systems, TFS1 and TFS2.
Each TFM has a beginning side (BOT) and an ending side
(EOT). To move
between BOT and EOT of a Tape File Mark, and to move
to other Tape
File Systems on the tape, a device driver called a tape
controller
program is required. Because of the differences in these
tape controller
programs, fast restore will always go to EOT of the
mark before the
TFS requested. The tape is rewound to the physical beginning
of the
tape between tape requests.
The Design
The backup script, fast.bu (Listing 4), reads a file
containing
a list of directories or file systems to be backed up.
First, fast.bu
creates the first backup, referred to as volume 0, with
a file containing
the list. Then each file system entry in the list is
written to tape
in succession.
The restore script, restore.bu (Listing 5), performs
maintenance
on a tape created by fast.bu. restore.bu restores the
volume 0 backup_list to disk and then displays a menu
which allows
an administrator to choose a Tape File System to view
or restore.
The report script, restore.bu reports statistics on
the last
fast backup.
Configuration and Portability
The three scripts use a configuration file which parameterizes
the
differences in mini-cartridge tape drives, tape controller
programs,
and Unix versions. These scripts have been tested with
the IBM RS-6000
(AIX), AT&T 3B2/600 (UNIX), and the HP 9000 (HP-UX).
Below is an example configuration entry for positioning
the cursor
on the screen:
cursor:cursor :Set the cursor position
The entry is divided into three colon-delimited fields.
The first field is the shell script parameter name,
the second is
the value, and the third is a comment. Listing 1 and
Listing 2 show configuration
system variables for AIX and HP-UX, respectively.
Typically, the Terminal Independent Operation Command,
tput
with "cup" argument, would be used to position
the cursor,
but since the AIX tput doesn't support the "cup"
argument, another substitiute is used. Listing 3
is a C program, cursor.c, which accepts the screen
coordinates
as arguments, with the upper left-hand corner being
0,0.
Backup List
Before creating a tape with fast.bu, you must create
a file
that lists each file system to be backed up, one per
line. Below are
three directories, /usr, /etc, and /jet, in the
backup list:
usr
etc
jet
These directories are backed up relative to root (/)
so that in a restore you could restore one or more of
them to another
disk location if you chose.
The fast.bu Script
Executing fast.bu
fast.bu requires the configuration file "-c"
and the
backup list of directories "-l" as arguments:
fast.bu -c /usr/bu/backup/config -l /usr/bu/backup/backup_list
Command-line Arguments
fast.bu checks for the existence of the arguments and
defaults
the configuration file to config and the backup list
to backup_list
in the present working directory (lines 23 to 37). To
check whether
two command-line arguments exist, the second command-line
argument
is appended to a value, x; if "$2" is not
defined,
then x will equal x (line 23).
fast.bu supports a variable number of command-line arguments
since the space between the switch and the file location
may or may
not exist:
fast.bu -c/usr/bu/backup/config -l/usr/bu/backup/backup_list
The while loop/case statements (lines 40 - 72) support
a variable number of arguments. The shell shift command
shifts arguments
to the left ($2 to $1, $3 to $2, etc) and decrements
the number of
variables, "$#", by one.
Initialization
The shell variables are set by searching the configuration
file for
a particular pattern and cutting out the second field
delimited by
a colon (lines 94 - 102).
The temporary files located in the working directory
are nulled out
or initialized with the system date (lines 112 - 119).
The temporary
files are used so the report.bu script can report on
the status
of the last backup.
If the preprocess.bu file exists in the program directory,
each line in the file is executed by the shell (lines
124 - 130).
If these commands are related to performing some function,
they should
be in a shell script of their own that is called by
preprocess.bu.
Create Volume 0
fast.bu writes the backup list file, named tape.directory,
to tape, using cpio and no rewind device to create volume
0
(lines 135 - 139). The HP-UX version of cpio uses the
"R"
for Re-syncronization argument (see the configuration
file in Listing 2).
HP-UX seems to have trouble reading some tape headers
and if cpio
is not told to (R)e-syncronize, it fails.
Create the Main Backup
Each of the directories in the backup list is then written
to tape.
The tty command determines whether the backup is running
in
the foreground or the background (line 147). If the
command does not
fail (exit status 0), the backup is running in the foreground.
Foreground Backup
In foreground backup, for each file system the number
of files to
be backed up and the estimated amount of time is displayed.
As each
file system backup progresses the percentage completed
and the total
time is displayed (see Figure 2).
For each file system, the number of files to back up
is determined
(line 154); the cpio command is executed in the background
with the no rewind device. The while loop (lines 161
- 214)
continues until the background cpio command completes
(line
165) and the next file system is backed up.
Within the while loop, the following variables are calculated
and displayed:
HOW_MUCH -- This is how many files have
actually been backed up. The number is obtained by checking
the line
count in cpio's output file (line 170). For users of
HP-UX:
the cpio command creates an extra output line stating
the reel
number being used. This bumps the number of files done
by 1.
PERCENT -- This is just percentage completed
as determined by an awk script (line 176).
TIME_SO_FAR -- This is the amount of minutes
between the system time and when this file system started
backing
up. The starting and present time are determined by
the date command
(lines 157 and 182) as the number of days, hours, and
minutes since
the first of the month. The awk script (line 185) accepts
the start
time and now strings, splits them into two arrays of
three elements
each using an awk internal function call, and then does
the calculation.
Be aware that awk arrays have a character index. For
more information,
see The AWK Programming Language. Notice the two arguments
to the awk script are in quotes. If they weren't, the
two arguments
would be six.
ETTC -- This is the estimated time to completion
determined by another awk script (line 199). This is
a gross approximation
since the awk script assumes that each file takes the
same amount
of time to backup. Surprisingly, when backing up hundreds
of megabytes
this approximation is close.
Background Backup
If the backup is running in the background or from cron
(lines
226 - 237), each file system is backed up without displaying
file
and time information.
Report Section
After the backup is complete, each file system is displayed
with its
approximate size; the total approximate size in megabytes,
the total
time in hours, and the average rate in magabytes/hour
are also displayed
(lines 242 - 282). If the backup had been run from cron,
the
output of the report would have been mailed to the cron
user (usually
root). Figure 3 shows a sample report.
If the kernel has been defined in the configuration
file, the script
then backs up the kernel and rewinds the tape. If not,
it simply rewinds
the tape (lines 287 - 292). For HP-UX users: the cpio
command
refuses to back up the kernel, so we recommend eliminating
the kernel
definition from the configuration file.
If the postprocess.bu file exists in the program directory,
each line in the file is executed by the shell (lines
297 - 303).
Once again, if these commands are related to performing
some function,
they should be in a shell script of their own.
The restore.bu Script
Executing restore.bu
Once a fast backup tape has been created, you need a
method of accessing
the tape. restore.bu provides that method. When executing
restore.bu,
you'll use the same configuration file as for fast.bu:
restore.bu -c /usr/bu/backup/config
Once configuration is completed, restore.bu rewinds
the tape to the physical beginning and restores the
directory file,
tape.directory, from volume 0 (lines 94 - 96). The UNIX
print
format command, pr, creates two temporary files that
are used
to maintain and display the tape file systems (lines
98 - 99).
When the preliminary setup is complete, the script defines
all the
functions needed (lines 101 - 316). Then only three
top-level function
calls control the script (lines 320 - 322).
The Initial Display
The call to function DIS_dir() clears the screen, displays
a heading with the creation day and time, and displays
the file systems
available in columns of three (lines 101 - 110).
The call to function FTS_dir() prompts for the number
of a
file system to access, verifies the choice, and places
the tape head
at the TFM at the beginning of the volume chosen (lines
112 - 175).
Figure 4 shows an example for 5 tape file systems after
DIS_dir
and FTS_dir have been executed.
From the example in Figure 4, if the administrator chooses
TFS 4,
var, the tape head will fast forward ahead three tape
file marks,
since the head is already at the start of the first
TFS (lines 166
- 174).
The Tape Management Menu
After the user chooses a TFS, the function nrestore_menu
(lines
177 - 234) displays the main menu (see Figure 5). The
menu options
available, along with a description of what each does,
are as follows.
View file system files
The view file system option provides a table of contents
using the
cpio command (line 219). The tee command redirects the
output of the cpio command to the file tfs_list located
in the working directory, and to the pg command.
If pg is terminated before the end of the cpio command,
only
those files displayed to that point exist in the tfs_list
file.
By default, pg displays 23 lines of text; to eliminate
the
need to continue striking the Enter key, at the pg prompt,
change the window size by entering iw where i is the
new window size (i. e. 5000w).
When the view completes or terminates, the restart function
(lines
298 - 308) rewinds the tape, positions the tape head
at the beginnning
of the first volume, and then executes functions DIS_dir
and
FTS_dir redisplaying the Tape Managemenut Menu.
Restore a File
Function RF_menu (lines 236 - 275) restores file(s)
to the
system. The administrator must answer the following
prompts:
1) Please enter the complete restore path. Wildcards
are also
supported. For example, to restore the /usr2 and all
child directories,
enter: usr2/*.
2) Do you want to overwrite existing files? If the answer
is (Y)es,
build the cpio argument variable with the unconditional
copy switch.
3) Is this a high priority restore? If the answer is
(Y)es, build
the cpio variable preceded by the NICE parameter as
set by the configuration
file (line 263). In order to raise the restore priority
with the nice
command, log in as root.
Now the files are restored using the variables previously
built (line
271). When the restore completes or aborts, the restart
function executes
as previously described.
Rewind Tape Drive and Exit
Choosing this option executes the rewind function, which
rewinds the
tape to its physical beginning and terminates the script
(line 224).
Choose another Tape File System
Choosing this option executes the previously described
restart function,
which redisplays the Tape Management Menu.
Exit (Leaving tape at current position)
Choosing this option terminates the script without rewinding
the tape.
Now an administrator familiar with the tape commands
and volume structure
can act on a tape volumne from the command line. Before
re-using the
tape with fast backup, be sure to rewind the tape, since
fast backup
does not rewind before creating volume 0.
The report.bu Script
Listing 6 is the report.bu script, which reports on
the last fast
backup performed using the temporary files left in the
working directory.
It provides the same backup statistics as for the fast
backup script.
Conclusion
This article has provided a method for creating fast
backup and restore
procedures using DAT tape drives and UNIX shell programing.
With some
prior planning, you can divide the system directories
into Tape File
Systems and thus further increase the speed of DAT tape
drives.
References
Anderson, Craig."FAQ: Info About Tape Drives."
comp.unix.aix, April 1993.
Strang, John, Tim O'Reilly, Linda Mui. Termcap
and Terminfo. O'Reilly & Associates, 1989.
Aho, Alfred, Brian Kernighan, Peter Weinberger. The
Awk Programming Language. Addison-Wesley, 1988.
Kochan, Stephen, Patrick Wood. Unix Shell Programming.
Hayden Books, 1987.
Prata, Stephen, Donald Martin. Unix System V
Bible, Commands and Utilities. The Waite Group, 1987.
About the Authors
Jon Alder has been working on open systems for 10 years,
most of
the time in systems design and management. He was the
Director of MIS
for Marie Callenders Restaurants and now works for Columbia
Sportswear in
the Developement division of the MIS department.
Ed Schaefer is an Informix software developer and UNIX
system administrator
at jeTECH Data Systems of Moorpark, CA, where he develops
Time and
Attendance Software. He has been involved with UNIX
and Informix since
1987 and was previously head of software development
at Marie Callendar
Pie Shops of Orange, CA.
|