Cover V10, I07

Backup on a Budget

W. Curtis Preston

Not everybody has the money to buy a million-dollar storage area network (SAN) completely dedicated to backup and recovery. Not everybody needs a SAN! In fact, the traffic on backupcentral.com shows that thousands of you can't even afford to buy backup software at all. (The pages that offer help for free backup software are among my most popular pages.) If this group of people includes you, then I want you to know that I haven't forgotten you. In fact, this column is just for you.

First, let's discuss some of the reasons why you might say, "We don't have enough money for backup software." Many of these reasons are valid, but in my opinion some of them are not:

You Are a Small Business on a Shoestring Budget

Many businesses operate on very little money, and purchasing commercial backup software can be very expensive. If you're in this category, perhaps you've been ignoring the issue of backup altogether and have buried your head in the sand. If so, I would rather you save the money that you would spend on software and spend some of it on better backup hardware.

You Are a Free Software Enthusiast

If you run Linux and StarOffice simply because you want to support the concept of free software, then I'm talking about you. While no one who uses Amanda would claim that it has all of the features that commercial backup products offer, its capabilities are sometimes quite amazing -- and it's free! The developers of Amanda are as interested in advancing the cause of free software as they are in developing a good backup product. If you're used to compiling your own kernel, managing Amanda should be a breeze.

Backups Aren't Valued as Much as They Should Be

I've seen a lot of groups that fall under this category. I've seen the department that can afford a million-dollar server, but not the $50,000 library necessary to back it up. I've seen companies that have terabytes upon terabytes of data, but balk at the price of a backup product whose server price starts at a few thousand dollars. My opinion is that if you can afford more than a terabyte of data, you can afford to spend the money on a commercial backup and recovery system. This opinion is also based on my belief that the right commercial backup software can actually help you get much more value out of your backup hardware. (Proving that opinion, however, is well beyond the scope of this article.)

The lines between these groups of people are quite gray -- and quite wide. If you've been ignoring backups because you thought they were too expensive to deal with, then this article should really help. If you've been making it on your own with just NTBACKUP and your own dump and tar shell scripts, then moving up to a better free product might be the next logical move. I'll explain your options. If you really think you need a commercial product, but backups in your company just don't get the respect that they deserve, then this article probably won't help much. [1]

This article is not necessarily aimed at the individual user with one machine to back up. Although much of the information here can be applied to that situation, this article is mainly aimed at small businesses or the super-geek with four to five personal machines to back up.

Hardware

People often scrimp on backup hardware, thinking that they'll just swap tapes when it comes time to create a full backup. Please don't scrimp here. At a bare minimum, get a tape drive, small stacker, or autoloader that can hold one night's backup of your entire environment. (This is a full backup as defined later in this article.) If you buy a backup device that's too small, you will be forced to swap tapes all the time just to get a decent backup. (Even I hate swapping tapes!) Backups will become drudgery; you won't want to do them; and they won't get done.
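To make the "one night's backup" rule concrete, here is a rough sizing sketch. The function name and the example figures are hypothetical, not taken from any vendor's specifications. It simply divides the size of a full backup by the effective capacity of one tape and rounds up. Note that published capacities often already include an assumed compression ratio, so pass the native capacity if you want to apply your own ratio.

```python
import math


def tapes_needed(full_backup_gb, native_capacity_gb, compression_ratio=1.0):
    """Return how many tapes one full backup would need.

    compression_ratio is an assumption about your data (2.0 means 2 to 1);
    leave it at 1.0 to plan for the worst case of incompressible data.
    """
    if full_backup_gb < 0 or native_capacity_gb <= 0 or compression_ratio <= 0:
        raise ValueError("sizes and ratio must be positive")
    effective_capacity_gb = native_capacity_gb * compression_ratio
    return math.ceil(full_backup_gb / effective_capacity_gb)


# Hypothetical shop: 120 GB to back up, tapes that hold 20 GB native.
print(tapes_needed(120, 20))       # 6 tapes with no compression
print(tapes_needed(120, 20, 2.0))  # 3 tapes if the data really compresses 2 to 1
```

If the answer is more than one tape, a standalone drive means swapping tapes by hand every night, which is the situation the stackers and autoloaders discussed below are meant to avoid.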

Random-Access Removable Media

Random-access removable media devices come in various forms. Zip and Jaz drives dominated this market for quite a while, but CD-R and CD-RW have recently become much less expensive, and such drives are showing up everywhere. (The laptop on which I am writing this came standard with an internal CD-R drive.) A recent addition to this market is the various types of recordable DVD, but this market has not quite taken hold yet, perhaps because of the confusion among the four competing formats: DVD-RAM, DVD-R, DVD-RW, and DVD+RW.

The advantage of all random-access removable media is access time during restores. While their transfer rates may be much slower than those of their tape counterparts, you can recover small files almost instantly because of their random-access nature. Although these devices offer very fast access to backed-up data, they all share the same limitation: capacity. Even a modern laptop has more drive space than one of these media can hold. (My laptop also came standard with a 30 GB hard drive.) While these devices are fine for swapping large files, archiving data, and making MP3s, they require too much swapping and management to be used for regular backups of any reasonably sized company. (Depending on how you configure your backup software, they may be perfect for the individual user, though.)

Tape Drives

There are a lot of choices here. Believe it or not, the drives in Table 1 range in price from less than $800 to more than $20,000. (The drives are listed in alphabetical order.) I realize that many may feel that a $20,000 drive does not belong in an article called "Backup on a Budget." However, it's really difficult to know where to draw the line. These drives and others are listed at: http://www.backupcentral.com/hardware-drives.html, including links to the vendors of each drive. [2]

Each of these drives will safely back up your data, but which drive is right for you? That requires a little research on your part. If you're truly limited on funds, there are several drives in this list that are either less than or close to $1,000. The best I can do in this article is to tell you to read up on each of these drives using Usenet, Web searches, and discussion forums. See Table 1.

Stackers and Autoloaders

If you're trying to follow my advice above, and you're trying to buy a single device that can hold at least one night's backup, it's possible that none of the drives in Table 1 will meet your needs. The next step beyond a standalone drive is to get a tape stacker, also sometimes called an autoloader. A tape stacker is different from a library in that it can be placed in sequential mode. When a stacker is operating in sequential mode, ejecting a tape causes it to load the next tape in the magazine. Such stackers hold from six to as many as thirty tapes. These stackers are perfect for homegrown shell scripts that don't understand how to manipulate a standard tape library. However, once you outgrow your homegrown shell scripts and want to move up to a free or inexpensive backup utility, these stackers can be completely controlled with a free program called mtx. mtx can be used with Amanda and BRU Professional. The home page for mtx is: http://www.mtx.sourceforge.net.
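As an illustration of why sequential mode suits simple scripts, the sketch below builds the two commands a homegrown nightly job might run: write the archive with tar, then take the drive offline with mt so the stacker loads the next tape. It also builds an explicit mtx load command for the case where the changer is driven directly. The device names (/dev/nst0, /dev/sg1) and the backup path are assumptions that vary from system to system, and the sketch only prints the commands rather than running them.

```python
import shlex


def sequential_backup_commands(paths, tape_device="/dev/nst0"):
    """Commands for one backup on a stacker in sequential mode.

    tape_device is an assumed no-rewind tape device name.
    """
    return [
        # Write one tar archive of the given paths straight to tape.
        ["tar", "-cf", tape_device] + list(paths),
        # Rewind and eject; in sequential mode the stacker then loads
        # the next tape in the magazine on its own.
        ["mt", "-f", tape_device, "offline"],
    ]


def mtx_load_command(slot, changer_device="/dev/sg1"):
    """Command to load the tape in a given slot with mtx.

    changer_device is an assumed SCSI generic device for the changer.
    """
    return ["mtx", "-f", changer_device, "load", str(slot)]


if __name__ == "__main__":
    # Dry run: show what would be executed without touching any hardware.
    for cmd in sequential_backup_commands(["/home", "/etc"]):
        print(shlex.join(cmd))
    print(shlex.join(mtx_load_command(3)))
```

Keeping the commands as lists makes it easy to hand them to subprocess.run later, once you have checked the device names on your own system.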

A complete directory of stacker and autoloader vendors is available at:

http://www.backupcentral.com/hardware-libraries.html#Stackers

Making Your Systems Easier to Back Up

When designing backup and recovery systems for enterprise customers, I usually advise them to back up everything on every drive. I recommend configuring backup software in such a way that it backs up every single file system or drive on the system, and then customers will always know that everything is getting backed up. This is because large environments would actually lose money, and add risk, by being forced to proactively monitor each system for new drives that need to be backed up. One of the main reasons behind this belief is that the operating system and application disks (which would typically be excluded in a customized backup setup) comprise a very small part of the total amount of data being backed up. What's the point in going through all of the extra work (and risk) to exclude 1 GB of data from a 100-GB system?

However, in the small shop, I'm a little bit more flexible. The main reason for this is that the ratio of OS and applications to data is usually much higher in the small shop, which is often made up of PCs running dozens of bloated applications with very little user data. For example, even with all the documents and emails that I've saved over the years, my "My Documents" folder is just over 1 GB. However, the 56 applications in "Program Files" and the operating system in WinNT add up to more than 3 GB. (We won't talk about the 6 GB of MP3s on my D: drive, okay?)
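The arithmetic behind that difference is easy to check. The sketch below uses the round figures from the two examples above (1 GB out of 100 GB on the large system, and roughly 3 GB of OS and applications against 1 GB of documents on the laptop, MP3s left out). The function name is just for illustration.

```python
def excludable_fraction(os_and_apps_gb, data_gb):
    """Fraction of a system's total size taken up by OS and applications."""
    total_gb = os_and_apps_gb + data_gb
    if total_gb <= 0:
        raise ValueError("total size must be positive")
    return os_and_apps_gb / total_gb


# Large system: 1 GB of OS and applications on a 100-GB system.
print(f"{excludable_fraction(1, 99):.0%}")  # 1%

# Small-shop laptop: about 3 GB of OS and applications, about 1 GB of data.
print(f"{excludable_fraction(3, 1):.0%}")   # 75%
```

Excluding 1% of a backup is not worth the risk of missing something; excluding 75% can mean the difference between one tape and several.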

If your shop is small enough, you can configure your systems to save quite a bit of tape space. Here's what I do:

1. Identify any applications that require you to save their data in their directory. As much as I like Eudora, I don't like the fact that all mail folders must be saved underneath where you installed Eudora. There's not much you can do to get around this on a Windows machine, but you can usually trick a UNIX application that does this by creating symbolic links to a common directory.

2. Move/reinstall such applications into a common directory. The common folder on my laptop is the "My Documents" folder. Because Eudora will not allow me to specify the location of its mailboxes, I move and reinstall Eudora into the My Documents folder. In the case of a UNIX application, you can often just move the data and create symbolic links. Please consult your application vendor before doing this.

3. Identify all applications that allow you to customize where documents are stored. Most applications will allow you to specify where their data is kept. To continue my Microsoft-based examples, Tools -> Options -> File Locations will allow you to specify which directories hold an application's data. UNIX database applications, such as Oracle, are also good examples of applications that allow you to specify file locations.

4. Move all other documents into a common directory. Choose a common directory and move all of your data into it. This should be the same directory as the one you used in Step 2.

5. Customize all applications listed in Step 3, telling them to look for and save documents in the common folder.

If done properly, this would allow you to back up only the common directories, and exclude the operating system and most applications. Although this method is completely opposite to what I recommend for large environments, small environments may be able to successfully implement this configuration, and save themselves quite a bit of tape.
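For the UNIX case, the steps above might be sketched as follows. This is only an illustration under assumptions: the function names and directory layout are hypothetical, the archive is written to a file rather than a tape, and you should still check with your application vendor before moving its data. The first function moves an application's data directory under the common directory and leaves a symbolic link at the old location; the second archives only the common directories.

```python
import shutil
import tarfile
from pathlib import Path


def relocate_with_symlink(app_data_dir, common_dir):
    """Move app_data_dir under common_dir and leave a symlink behind."""
    src = Path(app_data_dir)
    dest = Path(common_dir) / src.name
    if src.is_symlink():
        return dest  # Already relocated on an earlier run.
    if dest.exists():
        raise FileExistsError(f"{dest} already exists")
    Path(common_dir).mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dest))
    # The application keeps using its old path, which now points
    # at the data inside the common directory.
    src.symlink_to(dest.resolve(), target_is_directory=True)
    return dest


def archive_common_dirs(common_dirs, archive_path):
    """Write a compressed tar archive containing only the common dirs."""
    with tarfile.open(str(archive_path), "w:gz") as tar:
        for directory in common_dirs:
            # Store each directory under its own name, not its full path.
            tar.add(str(directory), arcname=Path(directory).name)
    return archive_path
```

A call such as relocate_with_symlink("/opt/someapp/mail", "/home/shared") followed by archive_common_dirs(["/home/shared"], "/tmp/common.tar.gz") would cover Steps 2 and 4 for one application; both paths are made up for the example.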

Choosing and Installing Backup Software

There are a number of free backup software packages listed at:

http://www.backupcentral.com/toc-free-backup-software.html

If you've got a tape drive and just want a quick backup without installing and customizing a big package, try hostdump.sh. If you'd like a utility that can back up many gigabytes of data using disk caching and tape libraries, then Amanda's your best bet. Between these two programs are several other free utilities that may also serve your needs. You can learn all about installing and configuring Amanda by reading the book excerpt at: http://www.backupcentral.com/amanda.html. In addition to these utilities are the commercial backup utilities listed at: http://www.backupcentral.com/software-directory.html. Although many of these products are very expensive, there are a few inexpensive products that can help get you started on your way to good backups.

Do It

I hope these suggestions will get you started on your road to recovery. Before you do anything else, get yourself a backup of some type using whatever you need. Once you've done that, please do the research necessary to choose the appropriate backup software and hardware for your environment. I'll provide more information next month!

[1] What you need is a visit from "Mister Backup." I pity the fool that tells me backups aren't important!

[2] Both the capacity and transfer speeds listed in this table include the vendors' stated average compression ratio -- typically 2 to 1.

W. Curtis Preston has specialized in storage for over eight years, and has designed and implemented storage systems for several Fortune 100 companies. He is the owner of Storage Designs, the Webmaster of Backup Central (http://www.backupcentral.com), and the author of two books on storage. He may be reached at curtis@backupcentral.com. (Portions of some articles may be excerpted from Curtis's books.)