Cover V05, I07


Sidebar: How Safe Is Your Online Backup?

Cory Bear

System administrators, backup operators, and others who evaluate backup products will often see the phrase "online backup" in the advertisements for backup products. However, not all vendors mean the same thing by this term.

Perhaps the most famous backup program that runs on the UNIX operating system is the "dump" program. But if the dump program is famous, that is mostly because it is included free in many versions of UNIX, not because of superior functionality. The designers of dump programmed it to read file information directly from the disk driver, rather than the filesystem, to optimize the program's speed.

Backup operators, however, soon found out that if you use dump to back up a filesystem, then that filesystem should be unmounted, or at least idle, during the process if the backup is to be reliable. Because users can't work when a filesystem is disabled in this way, these backups have to be done "off-line" -- on weekends or late at night. Backup operators also wanted the flexibility to run backups "online" (during regular working hours), so several commercial backup products were created that offer online functionality.

One way to provide this functionality is for the tool to read file information from the filesystem, rather than the device driver. This method is less direct, but it allows the backup operation to proceed while the filesystem is mounted. The backup program simply walks through the filesystem, reads a file, copies that file to tape, and then moves on to the next file. Many vendors refer to this process as online backup because users can access the filesystem during the backup operation. This capability exposes a flaw in the process: if a user writes to a file as it is being backed up, then that file may not be backed up properly!
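
The walk-and-copy approach can be sketched in a few lines of Python. This is only an illustration, not how any particular product works: the function name backup_tree is hypothetical, and a tar archive on disk stands in for the tape. Note that nothing in the sketch stops a user from writing to a file while it is being read.

```python
import os
import tarfile


def backup_tree(source_dir, archive_path):
    """Walk source_dir through the mounted filesystem and archive each regular file.

    Hypothetical helper: archive_path (a tar file) stands in for the tape
    and should live outside source_dir.
    """
    saved = []
    with tarfile.open(archive_path, "w") as archive:
        for dirpath, _dirnames, filenames in os.walk(source_dir):
            for name in sorted(filenames):
                path = os.path.join(dirpath, name)
                if not os.path.isfile(path):
                    continue  # skip sockets, broken symlinks, and the like
                relative = os.path.relpath(path, source_dir)
                # Nothing here prevents another process from writing to
                # `path` while it is being read into the archive.
                archive.add(path, arcname=relative)
                saved.append(relative)
    return saved
```

The filesystem stays mounted and usable throughout, which is exactly why a file can change underneath the reader.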

Keep in mind that the goal of a backup program is to save a "snapshot" of your files as they existed at some instant in time. For example, if there is a file named BANKSLIP that contains the characters "DEPOSITS $9000," and you take a snapshot of BANKSLIP, then that snapshot will also contain the characters "DEPOSITS $9000." Unfortunately, it is not always possible to save an accurate copy of a file if the contents of that file can change as the file is being copied (e.g., if a user can write to the file as it is being backed up).

For example, suppose a backup program tries to back up the file BANKSLIP. It starts by reading the first eight characters, namely "DEPOSITS," but at that moment, a user changes the file's contents to "WITHDRAW $1000." Then, the backup program continues to read the remaining six characters of BANKSLIP, namely " $1000." The backup program will then save the contents of BANKSLIP as "DEPOSITS $1000."
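
The BANKSLIP scenario can be reproduced deterministically. The sketch below is an illustration under stated assumptions: the function name torn_read_demo is hypothetical, a temporary file plays the part of BANKSLIP, and the "user" write is placed by hand between the backup program's two reads instead of arriving from a second process.

```python
import os
import tempfile


def torn_read_demo():
    """Reproduce the BANKSLIP mix-up with one reader and one in-place writer."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(b"DEPOSITS $9000")

        # buffering=0 makes each read() go to the file, so the reader
        # really does see whatever is on disk at the time of the call.
        with open(path, "rb", buffering=0) as backup:
            first = backup.read(8)            # reads b"DEPOSITS"

            # A user overwrites the file in place between the two reads.
            with open(path, "r+b", buffering=0) as user:
                user.write(b"WITHDRAW $1000")

            rest = backup.read()              # reads b" $1000"
        return (first + rest).decode("ascii")
    finally:
        os.remove(path)


print(torn_read_demo())  # DEPOSITS $1000
```

The saved copy matches neither the old contents nor the new ones.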

A backup program can avoid this problem by reading files atomically. An atomic operation is completed entirely before another operation is allowed to commence. For example, in an atomic read, a file is completely read before anyone is allowed to write to that file.

One way to implement an atomic read is to "lock" a file while it is being read. Locking a file effectively keeps any other person or program from writing to that file while it is being read. This technique may inconvenience anyone who gets blocked, but it will solve the BANKSLIP problem.
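
As a minimal sketch of the locking idea, the following uses fcntl.flock, which is available on UNIX-like systems only. The helper names read_locked and write_locked are hypothetical. An important assumption: these locks are advisory, so they protect the file only if every writer also asks for the lock; a program that ignores the lock can still write.

```python
import fcntl


def read_locked(path):
    """Read a whole file while holding a shared advisory lock on it."""
    with open(path, "rb") as f:
        fcntl.flock(f, fcntl.LOCK_SH)   # waits while a writer holds LOCK_EX
        try:
            return f.read()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


def write_locked(path, data):
    """Rewrite an existing file in place while holding an exclusive lock."""
    with open(path, "r+b") as f:
        fcntl.flock(f, fcntl.LOCK_EX)   # waits while any reader holds LOCK_SH
        try:
            f.write(data)
            f.truncate()                # drop leftover bytes from older contents
            f.flush()                   # push data out before releasing the lock
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

If the backup program calls read_locked and the banking program calls write_locked, the writer blocks until the read finishes, which is the inconvenience described above.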

A better way to implement an atomic read is to use filesystem cloning. A filesystem "clone" is a copy, or snapshot, of an entire filesystem as it exists at a given time. The clone operation is also an atomic operation, so, if a backup program first clones the filesystem, then backs up the clone, it can prevent the problem with the BANKSLIP file. In fact, it also prevents a related problem of maintaining version synchronization between files.

For example, suppose there are two files named BANKER_COMMAND and BANKER_AMOUNT, which contain "DEPOSIT" and "$9000," respectively. The banking program that uses these files follows a rule stating that if the program modifies one of these files, then it must immediately modify the other to ensure that the versions are always synchronized. Unfortunately, if a backup program reads one of these files while the banking program is writing to the other, then the contents of those files will not be properly synchronized. Fortunately, filesystem cloning prevents this problem as well.
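
The cloning idea can be illustrated with a toy model. Everything here is an assumption made for the sake of the example: ToyFilesystem is a hypothetical in-memory stand-in, not a real filesystem interface, and its write_many method models the banking program's paired update as a single step so that a clone can never land between the two writes.

```python
import threading


class ToyFilesystem:
    """In-memory stand-in for a filesystem with atomic clones (hypothetical)."""

    def __init__(self, files):
        self._files = dict(files)
        self._lock = threading.Lock()

    def write_many(self, updates):
        """Apply a group of related writes as one step."""
        with self._lock:
            self._files.update(updates)

    def clone(self):
        """Return a frozen copy of every file as of this instant."""
        with self._lock:
            return dict(self._files)


fs = ToyFilesystem({"BANKER_COMMAND": "DEPOSIT", "BANKER_AMOUNT": "$9000"})
snapshot = fs.clone()   # taken before the backup starts reading

# The banking program keeps working while the backup reads the clone.
fs.write_many({"BANKER_COMMAND": "WITHDRAW", "BANKER_AMOUNT": "$1000"})

backup = {name: snapshot[name] for name in sorted(snapshot)}
print(backup)  # {'BANKER_AMOUNT': '$9000', 'BANKER_COMMAND': 'DEPOSIT'}
```

The backup holds a matched pair of files even though the live filesystem has since moved on.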

Another issue to consider when choosing an online backup product is resource consumption. All backup products consume computer resources as they run (e.g., CPU cycles to run the backup programs, I/O bandwidth to save files on local tape drives, and network bandwidth to save files on remote tape drives). If the backup product consumes too much of your computer's resources, it can slow users' work or stop it altogether. If this happens, then the backup product is effectively not performing an online backup.

Resource consumption is most important for computer systems with large amounts of data. Sometimes there is so much data that the backup program must run almost continuously. In such cases, it is particularly important that the backup program be unobtrusive to the user community, because the users will need to work while the backup program is running.

Incremental backups are one way to conserve resources. Unlike ordinary, full backups, incremental backups save only those files that have changed since the last backup. Incremental backups therefore consume fewer computer resources than full backups because fewer files are saved per backup operation. It is difficult, however, to recover a complete filesystem from incremental backups, because each incremental backup contains only a part of the backup data. Products that combine incremental backups into full backups on a backup server can help solve this problem.
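
One simple way to pick the files for an incremental backup is to compare each file's modification time with the time of the last backup. The sketch below assumes that approach; the function name changed_since is hypothetical, and a real product may track changes differently. It does not notice deleted files, and it trusts the modification times recorded by the filesystem.

```python
import os


def changed_since(source_dir, last_backup_time):
    """Return relative paths of regular files modified after last_backup_time.

    last_backup_time is in seconds since the epoch, as returned by time.time().
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(source_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and os.stat(path).st_mtime > last_backup_time:
                changed.append(os.path.relpath(path, source_dir))
    return sorted(changed)
```

Only the files in the returned list need to be copied, which is where the resource savings come from.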

A related technology is hierarchical storage management (HSM). HSM is a multilevel caching system in which files are migrated from a high-cost data storage medium (e.g., a hard disk) to a low-cost data storage medium (e.g., a tape). Files are automatically migrated to the low-cost medium as they age, until the only files left on the high-cost medium are those that have recently been accessed or modified. As a result, a backup of the high-cost medium is similar to an incremental backup.
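
The age-based migration can be sketched as follows. This is a rough illustration only: the function name migrate_cold_files is hypothetical, two directories stand in for the high-cost and low-cost media, and the sketch simply moves files, whereas real HSM products typically keep the file visible to users and recall it on demand.

```python
import os
import shutil


def migrate_cold_files(fast_dir, slow_dir, max_idle_seconds, now):
    """Move files idle for longer than max_idle_seconds from fast_dir to slow_dir.

    `now` is the current time in seconds since the epoch.
    """
    migrated = []
    for name in sorted(os.listdir(fast_dir)):
        path = os.path.join(fast_dir, name)
        if not os.path.isfile(path):
            continue
        info = os.stat(path)
        # A file counts as "in use" if it was recently accessed or modified.
        last_used = max(info.st_atime, info.st_mtime)
        if now - last_used > max_idle_seconds:
            shutil.move(path, os.path.join(slow_dir, name))
            migrated.append(name)
    return migrated
```

After a pass like this, the fast directory holds only recently used files, so backing it up resembles an incremental backup.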

There are many questions to ask your backup vendor before buying an online backup product: Can it perform a backup on a mounted filesystem? Can it back up a file safely while a user is writing to that file? If the versions of various files need to be synchronized, will the backup product preserve this synchronization? Will the product use so many resources that user activity will be impaired? If possible, demo the backup product on your own computer system with realistic amounts of data before making a purchase decision.