Sidebar: How Safe Is Your Online Backup?
Cory Bear
System administrators, backup operators, and others who evaluate backup
products will often see the phrase "online backup" in product
advertisements. However, not all vendors mean the same thing by this
term.
Perhaps the most famous backup program that runs on
the UNIX operating
system is the "dump" program. But if the dump
program is famous, that is
mostly because it is included free in many versions
of UNIX, not because
of superior functionality. The designers of dump programmed
it to read
file information directly from the disk driver, rather
than the
filesystem, to optimize the program's speed.
Backup operators, however, soon found out that if you use dump to back
up a filesystem, then that filesystem must be unmounted, or at least
idle, during the process for the backup to be reliable. Because users
can't work when a filesystem
is disabled in this
way, these backups have to be done "off-line"
-- on weekends or late at
night. Backup operators also wanted the flexibility
to run backups
"online" (during regular working hours), so
several commercial backup
products were created that offer online functionality.
One way to provide this functionality is for the tool
to read file
information from the filesystem, rather than the device
driver. This
method is less direct, but it allows the backup operation
to proceed
while the filesystem is mounted. The backup program
simply walks through
the filesystem, reads a file, copies that file to tape,
and then moves
on to the next file. Many vendors refer to this process
as online backup
because users can access the filesystem during the backup
operation.
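
The file-by-file walk described above can be sketched in a few lines.
This is only an illustration, not any vendor's implementation: the
function name backup_tree is hypothetical, and a tar archive on disk
stands in for the tape.

```python
import os
import tarfile


def backup_tree(source_dir, archive_path):
    """Walk source_dir and copy each regular file into a tar archive.

    archive_path is assumed to lie outside source_dir; otherwise the
    archive would try to back up itself.
    """
    saved = []
    with tarfile.open(archive_path, "w") as archive:
        for dirpath, _dirnames, filenames in os.walk(source_dir):
            for name in sorted(filenames):
                full_path = os.path.join(dirpath, name)
                if not os.path.isfile(full_path):
                    continue  # skip sockets, broken links, and the like
                # Store paths relative to the tree being backed up.
                rel_path = os.path.relpath(full_path, source_dir)
                archive.add(full_path, arcname=rel_path)
                saved.append(rel_path)
    return saved
```

Nothing in this loop stops a user from writing to a file while
archive.add() is reading it, which is exactly the weakness discussed
next.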
This capability suggests a flaw in the process: if a
user writes to a
file as it is being backed up, then that file may not
be backed up
properly!
Keep in mind that the goal of a backup program is to
save a "snapshot"
of your files as they existed at some instant in time.
For example, if
there is a file named BANKSLIP that contains the characters
"DEPOSITS
$9000," and you take a snapshot of BANKSLIP, then
that snapshot will
also contain the characters "DEPOSITS $9000."
Unfortunately, it is not
always possible to save an accurate copy of a file if
the contents of
that file can change as the file is being copied (e.g.,
if a user can
write to the file as it is being backed up).
For example, suppose a backup program tries to back
up the file
BANKSLIP. It starts by reading the first eight characters,
namely
"DEPOSITS," but at that moment, a user changes
the file's contents to
"WITHDRAW $1000." Then, the backup program
continues to read the
remaining six characters of BANKSLIP, namely " $1000." The backup
program will then save the contents of BANKSLIP as "DEPOSITS $1000,"
a value that the file never actually contained.
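
The interleaving in this example can be replayed deterministically. The
sketch below is a simulation only: an in-memory byte array stands in
for the file, and the function name torn_read is hypothetical.

```python
def torn_read(read_first=8):
    """Simulate a reader that is interrupted by a writer mid-file."""
    bankslip = bytearray(b"DEPOSITS $9000")

    # The backup program reads the first eight characters ...
    copied = bytes(bankslip[:read_first])

    # ... a user rewrites the whole file before the read finishes ...
    bankslip[:] = b"WITHDRAW $1000"

    # ... and the backup program reads the remaining six characters.
    copied += bytes(bankslip[read_first:])
    return copied.decode("ascii")


print(torn_read())  # DEPOSITS $1000
```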
A backup program can avoid this problem by reading files
atomically. An
atomic operation is completed entirely before another
operation is
allowed to commence. For example, in an atomic read,
a file is
completely read before anyone is allowed to write to
that file.
One way to implement an atomic read is to "lock"
a file while it is
being read. Locking a file effectively keeps any other
person or program
from writing to that file while it is being read. This
technique may
inconvenience anyone who gets blocked, but it will solve
the BANKSLIP
problem.
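
As a rough sketch of locking on a UNIX system, the code below uses the
advisory locks available through Python's fcntl module. The helper
names read_locked and write_locked are hypothetical. Note the main
assumption: advisory locks only block programs that also ask for the
lock, so every writer would have to cooperate.

```python
import fcntl


def read_locked(path):
    """Read a whole file while holding a shared advisory lock on it."""
    with open(path, "rb") as handle:
        # LOCK_SH blocks until no cooperating writer holds LOCK_EX.
        fcntl.flock(handle, fcntl.LOCK_SH)
        try:
            return handle.read()
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)


def write_locked(path, data):
    """Replace an existing file's contents under an exclusive lock."""
    # Open without truncating so the old bytes survive until we hold
    # the lock.
    with open(path, "r+b") as handle:
        fcntl.flock(handle, fcntl.LOCK_EX)
        try:
            handle.seek(0)
            handle.write(data)
            handle.truncate()
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
```

A backup program would call read_locked() for each file, and a writer
that calls write_locked() would wait until the read had finished.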
A better way to implement an atomic read is to use filesystem
cloning. A
filesystem "clone" is a copy, or snapshot,
of an entire filesystem as it
exists at a given time. The clone operation is also
an atomic operation,
so, if a backup program first clones the filesystem,
then backs up the
clone, it can prevent the problem with the BANKSLIP
file. In fact, it
also prevents a related problem of maintaining version
synchronization
between files.
For example, suppose there are two files named BANKER_COMMAND
and
BANKER_AMOUNT, which contain "DEPOSIT" and
"$9000," respectively. The
banking program that uses these files follows a rule
stating that if the
program modifies one of these files, then it must immediately
modify the
other to ensure that the versions are always synchronized.
Unfortunately, if a backup program reads one of these
files while the
banking program is writing to the other, then the contents
of those files
will not be properly synchronized. Fortunately, filesystem
cloning
prevents this problem as well.
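
The effect of backing up a clone can be shown with a toy model. Real
clones are made by the filesystem or volume manager; the ToyFilesystem
class here is a hypothetical in-memory stand-in, used only to show why
the backup sees a matched pair of files.

```python
class ToyFilesystem:
    """In-memory stand-in for a filesystem that supports cloning."""

    def __init__(self, files):
        self._files = dict(files)

    def write(self, name, contents):
        self._files[name] = contents

    def clone(self):
        # Strings are immutable, so a shallow copy of the
        # name-to-contents table freezes every file as it exists now.
        return dict(self._files)


live = ToyFilesystem({"BANKER_COMMAND": "DEPOSIT", "BANKER_AMOUNT": "$9000"})
snapshot = live.clone()

# The banking program keeps working after the clone is taken.
live.write("BANKER_COMMAND", "WITHDRAW")
live.write("BANKER_AMOUNT", "$1000")

# The backup reads only from the clone, so it sees a matched pair.
backup = {name: snapshot[name] for name in sorted(snapshot)}
print(backup)  # {'BANKER_AMOUNT': '$9000', 'BANKER_COMMAND': 'DEPOSIT'}
```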
Another issue to consider when choosing an online backup
product is
resource consumption. All backup products consume computer
resources as
they run (e.g., CPU cycles to run the backup programs,
I/O bandwidth to
save files on local tape drives, and network bandwidth
to save files on
remote tape drives). If the backup product consumes too much of your
computer's resources, it can slow users' work on that computer or
prevent it altogether. If this happens, then the backup product is
effectively not performing an online backup.
Resource consumption is most important for computer
systems with large
amounts of data. Sometimes there is so much data that
the backup program
must run almost continuously. In such cases, it is particularly
important that the backup program be unobtrusive to
the user community,
because the users will need to work while the backup
program is running.
Incremental backups are one way to conserve resources.
Unlike ordinary,
full backups, incremental backups save only those files
that have
changed since the last backup. Incremental backups therefore
consume
fewer computer resources than full backups because fewer
files are saved
per backup operation. It is difficult, however, to recover
a complete
filesystem from incremental backups, because each incremental
backup
contains only a part of the backup data. Products that
combine
incremental backups into full backups on a backup server
can help solve
this problem.
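
Both halves of this idea can be sketched briefly. The function names
changed_since and merge_backups are hypothetical, modification times
are assumed to be a good enough test for "changed," and each backup is
modeled as a simple table of file names and contents. Deleted files are
not handled.

```python
import os


def changed_since(source_dir, last_backup_time):
    """Return relative paths of files modified after last_backup_time."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(source_dir):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            if not os.path.isfile(full_path):
                continue  # skip broken links and special files
            if os.path.getmtime(full_path) > last_backup_time:
                changed.append(os.path.relpath(full_path, source_dir))
    return sorted(changed)


def merge_backups(full, incrementals):
    """Fold incrementals (oldest first) into a copy of a full backup."""
    merged = dict(full)
    for incremental in incrementals:
        merged.update(incremental)  # newer versions replace older ones
    return merged
```

An incremental run would save only the files that changed_since()
returns, and a backup server could use something like merge_backups()
to rebuild a complete, current image from one full backup and the
incrementals that followed it.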
A related technology is hierarchical storage management (HSM). HSM is a
multilevel caching system in which files are migrated from a high-cost
data storage medium (e.g., a hard disk) to a low-cost data storage
medium (e.g., a tape). Files are automatically migrated to the low-cost
medium as they age, until the only files left on the high-cost medium
are those that have recently been accessed or modified. As a result, a
backup of the high-cost medium is similar to an incremental backup.
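
The age-based migration rule might look something like the sketch
below. It is a simplification under stated assumptions: two directories
stand in for the high-cost and low-cost media, the function name
migrate_old_files is hypothetical, and the transparent recall of
migrated files that a real HSM product provides is left out.

```python
import os
import shutil
import time


def migrate_old_files(fast_dir, slow_dir, max_age_seconds, now=None):
    """Move files not accessed or modified recently to slow_dir."""
    now = time.time() if now is None else now
    migrated = []
    for name in sorted(os.listdir(fast_dir)):
        path = os.path.join(fast_dir, name)
        if not os.path.isfile(path):
            continue
        info = os.stat(path)
        # A file counts as "recent" if it was either read or written.
        last_used = max(info.st_atime, info.st_mtime)
        if now - last_used > max_age_seconds:
            shutil.move(path, os.path.join(slow_dir, name))
            migrated.append(name)
    return migrated
```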
There are many questions to ask your backup vendor before
buying an
online backup product: Can it perform a backup on a
mounted filesystem?
Can it back up a file safely while a user is writing
to that file? If the
versions of various files need to be synchronized, will
the backup
product preserve this synchronization? Will the product
use so many
resources that user activity will be impaired? If possible,
demo the
backup product on your own computer system with realistic
amounts of
data before making a purchase decision.