Backups -- In a League of Their Own
W. Curtis Preston
There was a time when backups could be and would be ignored. They were usually assigned as a collateral duty to the most junior systems administrator, and that poor person had to fight to buy every piece of hardware and every piece of media that they needed to get the job done. The hardest part about backups was swapping the tapes. Boy, have things changed!
The Good Old Days
When I first started in this business about 9 years ago, we backed up the entire department to five 8-mm tape drives and a few QIC tapes -- 60-MB QIC tapes at that! I rode shuttles from building to building, going to the different computer rooms to remove the previous night's tapes. Then I brought them all back to a central place, used a tape duplicating machine to copy them, and sent the originals to our off-site storage facility.
Then everything changed. We bought our first system that had more data than would fit on a tape. I tried adapting our little shell script that was never intended for such a job. Eventually, we gave up and bought our first commercial backup software package and a few tape libraries. Finally we could get our backups done without swapping tapes. We couldn't have been happier! Unfortunately, that wasn't good enough. Although we had a fine solution for our file system data, the database backups began falling apart. First, the Oracle DBAs started asking for hot database backups, as a nightly shutdown was no longer an option. Then the Sybase database backups that used to fit on a tape no longer did. How would we automate the swapping of tapes with a shell script?1 And finally, we had an Informix database that was so big that it required backing up to multiple tape drives simultaneously. None of the commercial backup software products available could solve these problems. Life sure got ugly for a while.
In an attempt to find a solution, I went to the biggest bookstore I could find and looked for a book on the subject. There weren't any books on the shelf, so I went to the counter to search the Books in Print database. Searching on the word backup brought up one book -- the Complete Guide to Mac Backup Management by Dorian Cougias. (Not much help for my UNIX shop.) Disillusioned, I read the backup chapters in several systems and database administration books. Even the best books covered the subject on only a cursory level, and none told me how to automate the backups of 200 UNIX machines that ran eight different flavors of UNIX and three different database products.
While I was trying to figure out how to make that happen, systems administrators around the world struggled to back up data as its size and complexity grew at an incredible rate. For a long time, there really were no good answers and very little help for people trying to protect multi-platform, multi-database, multi-terabyte data that needed to be on-line 24x7, and I think a lot of data was lost along the way.
The Good New Days
Things have finally gotten better. For one thing, backups don't get ignored anymore. In fact, one might say that backups and recoveries have come into a league of their own. They've become so complex that vendors have formed an entire industry to serve the backup and recovery needs of data centers everywhere. Even if we just count independent software vendors (ISVs) and independent hardware vendors (IHVs), there is a cornucopia of choices available on the market today. There are more than 50 backup software ISVs, and more than 15 IHVs that make tape stackers, autoloaders, libraries, silos, and optical libraries. (Links to all these vendors are available at: http://www.backupcentral.com).
The Hardware is Better
One reason backups have gotten better is that backup hardware vendors have made tremendous advancements. As shown in Table 1, when I first started using backup and recovery hardware, there were really only three vendors that were vying for the open systems, mid-range priced tape drive market: Exabyte, HP, and Digital.
In the last few years, though, the backup hardware market has heated up. As shown in Table 2, major mid-range tape drive vendors have released several new versions of their drives that have significantly increased in speed in capacity.
Although all of these drives have gotten bigger and faster to meet the demands of the market, they all have three negative things in common. First, they are all streaming drives. This means that they must be fed a constant, predictable stream of data for the drive to write at its optimum transfer rate. Since it is very difficult to achieve such a transfer rate in many environments, many of the faster drives are writing at speeds much less than their optimal speed, resulting in decreased tape capacity and decreased head and tape life due to shoe-shining.2 Therefore, some applications actually need slower tape drives.
In some cases however, applications need to send data to tape at much faster rates than is possible with these drives. This brings us to the second thing these drives have in common. Although they do have a pretty good price/performance ratio, some applications need more than these drives can deliver. Third, they have a relatively long random access time. Thus, it can take one or more minutes to access a file during a random file restore. Although this may not sound like a long time, it makes these drives completely incompatible with Hierarchical Storage Management (HSM) systems, where a file must be retrieved from media in seconds.
So, what drives solve these problems, you ask? There are actually quite a few. Each of the drives listed in Table 3 solves one or more problems that have plagued the traditional mid-range tape drive market. (Admittedly, they do not all quite fit into the mid-range price category, but there's a price for speed!)3
These drives are divided into four categories: bigger drives, faster drives, quicker drives, and slower drives. A quick drive is able to load a tape and find a file on that tape very quickly. A fast drive, on the other hand, is able to write data at a very fast rate. A slower drive, for the purposes of this article, is one that does a better job of keeping up with a slower incoming data rate than others.
Although auto-changers and tape libraries have significantly reduced the need for large-capacity tape drives, many shops cannot afford such equipment. For these shops, drive capacity is paramount. If what you need is a bigger drive, there are several to choose from in Table 3. Please remember that these are native capacities, and each of the tape drives in Figure 3 also offers hardware-based compression.
Let's consider hardware compression for a moment. Many people often discount hardware compression numbers, saying that you never get what the manufacturer claims, so you should never look at the compression numbers when comparing drives. Although you should definitely look at the native numbers first, there have been a number of benchmarks showing a definite difference in compression rates between the different compression algorithms that these drives are using. Although the actual compression rate will vary with the type of data that you have, a drive with a better compression algorithm can help -- just as the algorithms used by gzip do a better job of compressing files than the Lempel-Ziv algorithm used by the compress command.
It's amazing how fast tape drives have become in the last few years. The drives in Table 3 are ten times faster than the ones listed in Figure 1. Drives have become so fast that many applications are unable to stream them at their optimum rate. For example, although 100-Mb networks dominate most data centers today, four of the drives in Table 3 need more than 100 Mb to stream. Even if you resolve the bandwidth issue, it is difficult to send 10 MB or more across a network using a single stream of data.
The best way to take advantage of faster tape drives is to back up locally. Unfortunately, backing up locally has been traditionally very difficult to manage in a large environment. This is what Storage Area Networks (SANs) are for. SANs are an essential piece of your backup and recovery system if you want to use these faster tape drives. I will begin a multi-part series about SANs in next month's column, which will appear in the new on-line version of UNIX Review at http://www.unixreview.com).
Even when backing up locally, it is often not possible to stream one of these faster tape drives with a single stream of data. This is the job of the commercial backup software system that you select. It will take charge of sending multiple simultaneous streams of data to the same tape drive in order to supply it with enough data to stream at its optimum rate. (I described this technique in the article Backup Techniques -- Dynamic Parallelism in the February 1997 issue of Sys Admin magazine.)
There is one place in which faster drives are absolutely essential. If you've got a system with more than a few hundred gigabytes of data, you must do several things to back it up, the first of which is to buy a tape library with a whole bunch of these drives in it. Then, you will need to use parallelism to create multiple simultaneous streams of data coming from the system and going to those drives. You will also need to swap tapes throughout the process. (Thankfully, this whole process has been completely automated by commercial backup software vendors.)
Quickness is one area where tape drives have really gotten better. The drives in Figure 1 didn't even have fast file search capabilities. This meant that searching for a particular backup on a tape could take hours. (I remember having to increase the SCSI timeout value for Ultrix from 60 to 180 minutes because of this.) Searching for that same file today can occur in as fast as 23 milliseconds on optical style media, and even 8 seconds with tape media. That's wonderful, since today's HSM systems need to retrieve a file fast enough to appear almost invisible to the end user. (I'll cover HSM systems in a later article.)
Optical media previously was not an option for many people. CD-style platters stored less than a gigabyte, and platters that could store more than a gigabyte were 12 inches square! The drives, media, and libraries to manage them were extremely expensive. Although the per-MB cost of optical is still higher than tape, it is significantly less than it used to be and is now a viable option for many people who need the fast random-access rate. (Optical platters have another advantage, which is covered in the next section, Slower drives). When comparing optical drives to tape drives, note that optical platters do take 5-7 seconds to load. Most tape drives, however, don't even come close. Until recently, the quickest tape drive has been the AIT drive with a 34-second total time-to-data.4 Please note that this is twice as fast as its nearest competitor, and more than four times faster than DLT.
There's a new drive in town. The StorageTek 9840 drive (aka, Eagle) is breaking all the rules. It's got everything you could want in a tape drive starting with a 10 MB/s transfer rate and an incredibly fast time-to-data of 12 seconds. Although its 20-GB capacity may steer you toward a drive with a bigger capacity, look again. Consider that you can load this drive, search to midpoint, and unload this drive more than ten times before you can do the same thing with a DLT 8000 -- even if you include typical tape exchange times for a modern tape library. Its 4-second load time and 8-second search time make it a real contender for the random-access HSM market.
Note that all this technology does come at a price; the list price for a single 9840 drive is $27,400. (Even so, a 9840-based library might still win a cost-per-MB comparison with similarly sized optical libraries.) If you need a less expensive, quick tape drive, then AIT or AIT-2 is your best bet. Although AIT-2 has a little over half the transfer speed of the 9840 and its time-to-data value is about three times slower, it is about 1/10th of the cost.
Why is a slower drive an advancement? First, as noted above, fast drives aren't necessarily better. Many applications are unable to stream them at their optimum rate. These applications need a drive that can write at a slower rate. The drives that have recently filled this void are the Benchmark DLT1 and the Ecrix VXA-1. These drives have many of the advancements of modern tape drives, but are able to write at slower data rates -- and at a significantly reduced cost.
Benchmark Tape Systems (http://www.benchtape.com) licensed DLT technology from Quantum, and the result is their DLT1. It has the native capacity of a DLT 8000, but it writes at only 3 MB/s, or 6 MB/s with compression. At a price of $1,499, you've got yourself a perfect drive for many environments.
The VXA-1 tape drive from Ecrix (http://www.ecrix.com) has taken a completely different approach to writing data to tape. In fact, it's so different that they have videos on their Web site to explain it. They split data into packets, write them to tape, and perform multiple read-after-write comparisons to ensure that the data was successfully written to tape. The VXA's other truly new feature is a variable tape speed. Thus, it can vary the tape speed based on the incoming data rate. Although it has the same native tape speed as the DLT1, it can write at even slower rates if necessary.
Which drive is right for you? A VXA-1 drive costs 27 cents per MB, versus 37 cents per MB for the DLT1. However, DLTIV media is cheaper per MB than VXA media. Customers who are familiar with DLT may prefer the DLT1 to the VXA-1, but the VXA-1's ability to vary its incoming data rate sure sounds appealing. (One interesting feature of the DLT1 is that it would allow you to reuse your current stock of DLTIV media, where as the VXA-1 drive would require you to start over.) Only time will tell what the market thinks.
The Software is Better
Backup and recovery software has also become significantly better in the past few years. In the limited space here, let me mention just a few advancements that have become almost commonplace in today's backup and recovery products.
Several backup and recovery products can back up multiple platforms to a single master backup server, including backing up UNIX, NT, Macintosh, and mainframe systems to a UNIX server.
Auto-Discovery of File Systems
No longer must you specify a list of file systems to be backed up. Several backup products will automatically find and back up all mounted file systems.
Fully Searchable Indexes
These indexes, which are completely unavailable with dump and restore, now track the location of every file on every backup tape. When you need a file, the backup software automatically knows which tape has it.
Automated Media Management
Once the backup software knows which tape it needs, it's also completely capable of getting the tape and putting it in the drive itself, with the help of a tape library.
Many to One Parallelism
This feature takes multiple data streams and interleaves them into one continuous stream of data that is then sent to a single tape drive. This feature is required to keep today's tape drives streaming at their optimal rate.
One to Many Parallelism
This feature automatically starts multiple backups from a single client if that client has multiple file systems or raw devices to back up. Without this, it would be impossible to back up today's terabyte-sized systems.
A myriad of features are available in today's backup and software products. For a complete list of questions to ask when evaluating such products, see http://www.backupcentral.com/rfi.html.
Watch This Space
This column will be printed bi-monthly in Sys Admin magazine, and will appear monthly in the new online version of UNIX Review at http://www.unixreview.com. (Archived versions of this column will also be available at: http://www.backupcentral.com.) I will do my best to keep you informed on all the latest advancements and tricks of the trade of the backup and recovery industry.
The name for the column is the result of a contest that I sponsored at http://www.backupcentral.com, during which I received hundreds of suggestions from 76 participants. The winning contributor was Jim Haveron, and he received a free copy of my book, UNIX Backup and Recovery. (Special mention also goes to Vincent Cordrey, who submitted 134 suggestions!) It also seems appropriate to do a Top 10 list at this point taken from suggestions I received for names for this column:
Top 10 Rejected Names for this Column
1. Back Talk
2. Last Chance before God
3. Backup Boy (That's Mr. Backups, thank you!)
5. Baby's got Backup
6. Sex, Lies and Digital Tape
8. Minding the (Re)Store
9. Tape Talk
10. Back Off
1 Since we were a bank that wasn't allowed to use public domain software, we couldn't use Perl, Expect, or anything that might have actually accomplished the task. We were stuck with Bourne shell!
2 These negative effects are explained in detail in Chapter 18 of UNIX Backup & Recovery.
3 There are even a few drives that I didn't list, such as the high-end tape drives from Ampex and Sony. These drives cost even more than the ones listed in Figure 3!
4 The time-to-data values comes from adding the load time to the search time.
About the Author
W. Curtis Preston is a principle consultant at Collective Technologies (http://www.colltech.com), and has specialized in designing and implementing enterprise backup systems for Fortune 500 companies for more than 7 years. Portions of this article are excerpted from his O'Reilly book UNIX Backup & Recovery. Curtis may be reached at: email@example.com.