Backup on a Budget
W. Curtis Preston
Not everybody has the money to buy a million-dollar storage area
network (SAN) completely dedicated to backup and recovery. Not everybody
needs a SAN! In fact, the traffic on backupcentral.com shows
that thousands of you can't even afford to buy backup software
at all. (The pages that offer help for free backup software are
among my most popular pages.) If this group of people includes you,
then I want you to know that I haven't forgotten you. In fact,
this column is just for you.
First, let's discuss some of the reasons why you might say,
"We don't have enough money for backup software."
Many of these reasons are valid, but in my opinion
some of them are not:
You Are a Small Business on a Shoestring Budget
Many businesses operate on very little money, and purchasing commercial
backup software can be very expensive. If you're in this category,
perhaps you've been ignoring the issue of backup altogether
and have buried your head in the sand. If so, I would rather you
save the money that you would spend on software and spend some of
it on better backup hardware.
You Are a Free Software Enthusiast
If you run Linux and StarOffice simply because you want to support
the concept of free software, then I'm talking about you. While
no one who uses Amanda would claim that it has all of the features
that commercial backup products offer, its capabilities are sometimes
quite amazing -- and it's free! The developers of Amanda
are as interested in advancing the cause of free software as they
are in developing a good backup product. If you're used to
compiling your own kernel, managing Amanda should be a breeze.
Backups Aren't Valued as Much as They Should Be
I've seen a lot of groups that fall into this category.
I've seen departments that can afford a million-dollar server,
but not the $50,000 library necessary to back it up. I've seen
companies that have terabytes upon terabytes of data, but balk at
the price of a backup product whose server price starts at a few
thousand dollars. My opinion is that if you can afford more than
a terabyte of data, you can afford to spend the money on a commercial
backup and recovery system. This opinion is also based on my belief
that the right commercial backup software can actually help you
get much more value out of your backup hardware. (Proving that opinion,
however, is well beyond the scope of this article.)
The lines between these groups of people are quite gray --
and quite wide. If you've been ignoring backups because you
thought they were too expensive to deal with, then this article
should really help. If you've been making it on your own with
just NTBACKUP and your own dump and tar shell
scripts, then moving up to a better free product might be the next
logical move. I'll explain your options. If you really think
you need a commercial product, but backups in your company just
don't get the respect that they deserve, then this article
probably won't help much.[1]
This article is not necessarily aimed at the individual user with
one machine to back up. Although much of the information here
applies to that situation, the article is mainly
aimed at small businesses or the super-geek with four to five personal
machines to back up.
Hardware
People often scrimp on backup hardware, thinking that they'll
just swap tapes when it comes time to create a full backup. Please
don't scrimp here. At a bare minimum, get a tape drive, small
stacker, or autoloader that can hold one night's backup of
your entire environment. (This is a full backup as defined later
in this article.) If you buy a backup device that's too small,
you will be forced to swap tapes all the time just to get a decent
backup. (Even I hate swapping tapes!) Backups will become drudgery;
you won't want to do them; and they won't get done.
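As a rough illustration of this sizing rule, the following Python sketch estimates how many tapes one full backup would need. The function name, the example figures, and the conservative default of expecting no compression at all are illustrative assumptions, not numbers from any vendor.

```python
import math

def tapes_for_full_backup(data_gb, advertised_gb,
                          advertised_ratio=2.0, expected_ratio=1.0):
    """Estimate how many tapes one full backup needs.

    advertised_gb    -- capacity printed on the box (usually assumes compression)
    advertised_ratio -- compression ratio the vendor assumed (assumed 2:1 here)
    expected_ratio   -- compression you actually expect; 1.0 means none
    """
    if min(data_gb, advertised_gb, advertised_ratio, expected_ratio) <= 0:
        raise ValueError("all arguments must be positive")
    native_gb = advertised_gb / advertised_ratio   # capacity without compression
    usable_gb = native_gb * expected_ratio         # capacity you can count on
    return math.ceil(data_gb / usable_gb)

# Hypothetical shop: 90 GB of data, tapes advertised at 40 GB compressed.
print(tapes_for_full_backup(90, 40))                      # 5
print(tapes_for_full_backup(90, 40, expected_ratio=1.5))  # 3
```

If the answer is larger than the number of slots in the device you are considering, that device is probably too small for unattended full backups.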
Random-Access Removable Media
Random-access removable media devices come in various forms. Zip
and Jaz drives dominated this market for quite a while, but CD-R
and CD-RW have recently become much less expensive, and such drives
are showing up everywhere. (The laptop on which I am writing this
came standard with an internal CD-R drive.) A recent addition to
this market is recordable and rewritable DVD, but
it has not quite taken hold yet, perhaps because
of confusion among the four competing formats:
DVD-RAM, DVD-R, DVD-RW, and DVD+RW. The advantage to all random-access
removable media is access time during restores. While their transfer
rates may be much slower than their tape counterparts, you can recover
small files almost instantly because of their random access nature.
Although these devices offer very fast access to backed up data,
they all share the same limitation: capacity. Even a modern laptop
has far more drive space than one of these media can hold. (My laptop also
came standard with a 30 GB hard drive.) While these devices are
fine for swapping large files, archiving data, and making MP3s,
they require too much swapping and management to be used for regular
backups of any reasonably sized company. (Depending on how you configure
your backup software, they may be perfect for the individual user,
though.)
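To put the swapping problem in numbers, here is a small Python sketch that counts how many pieces of media a single copy of a 30 GB drive would take. The capacities in the table are assumed nominal values (they vary by model), and the function name is hypothetical.

```python
import math

# Nominal capacities in MB. These are assumed typical values; check your own media.
ASSUMED_CAPACITY_MB = {"Zip": 100, "CD-R": 650, "Jaz": 2000}

def media_needed(data_gb, capacity_mb):
    """Return how many pieces of media one full copy of data_gb requires."""
    return math.ceil(data_gb * 1000 / capacity_mb)

for name, capacity_mb in ASSUMED_CAPACITY_MB.items():
    print(f"{name}: {media_needed(30, capacity_mb)} pieces of media for 30 GB")
```

Under these assumptions, even the largest of the three requires more than a dozen swaps for one full copy.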
Tape Drives
There are a lot of choices here. Believe it or not, the drives
in Table 1 range in price from less than $800 to more than $20,000.
(The drives are listed in alphabetical order.) I realize that many
may feel that a $20,000 drive does not belong in an article called
"Backup on a Budget." However, it's really difficult
to know where to draw the line. These drives and others are listed
at: http://www.backupcentral.com/hardware-drives.html, including
links to the vendors of each drive.[2]
Each of these drives will safely back up your data, but which
drive is right for you? That requires a little research on your
part. If you're truly limited on funds, there are several drives
in this list that are either less than or close to $1,000. The best
I can do in this article is to tell you to read up on each of these
drives using Usenet, Web searches, and discussion forums. See Table
1.
Stackers and Autoloaders
If you're trying to follow my advice above, and you're
trying to buy a single device that can hold at least one night's
backup, it's possible that none of the drives in Table 1 will
meet your needs. The next step beyond a standalone drive is to get
a tape stacker, also sometimes called an autoloader. A tape stacker
is different from a library in that it can be placed in sequential
mode. When a stacker is operating in sequential mode, ejecting a
tape causes it to load the next tape in the magazine. Such stackers
hold from six to as many as thirty tapes. These stackers are perfect
for homegrown shell scripts that don't understand how to manipulate
a standard tape library. However, once you outgrow your homegrown
shell scripts, and want to move up to a free or inexpensive backup
utility, they can be completely controlled with a free program called
mtx, which works with Amanda and BRU Professional. The home page for
mtx is: http://www.mtx.sourceforge.net.
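A homegrown script for a stacker in sequential mode might look something like the Python sketch below. It only builds and prints the commands unless told otherwise. The device name /dev/nst0, the one-tape-per-file-system layout, and the function names are assumptions for illustration; "mt offline" is the usual way to rewind and eject a tape on Linux, but check the mt manual page on your own system.

```python
import subprocess

def sequential_backup_commands(paths, device="/dev/nst0"):
    """Build one tar/eject command pair per path.

    In sequential mode the stacker reacts to the eject by loading the
    next tape in the magazine, so no library-control commands are needed.
    """
    commands = []
    for path in paths:
        commands.append(["tar", "-cf", device, path])     # write archive to tape
        commands.append(["mt", "-f", device, "offline"])  # rewind and eject
    return commands

def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)

# Dry run: prints the four commands without touching the tape drive.
run(sequential_backup_commands(["/home", "/var"]))
```

A real script would also need to wait for the stacker to finish loading the next tape before starting the next tar, which this sketch does not attempt.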
A complete directory of stacker and autoloader vendors is available
at:
http://www.backupcentral.com/hardware-libraries.html#Stackers
Making Your Systems Easier to Back Up
When designing backup and recovery systems for enterprise customers,
I usually advise them to back up everything on every drive. I recommend
configuring backup software so that it backs up every single
file system or drive on the system; customers then always
know that everything is getting backed up. This is because large environments
would actually lose money, and add risk, by being forced to proactively
monitor each system for new drives that need to be backed up. One
of the main reasons behind this belief is that the operating system
and application disks (which would typically be excluded in a customized
backup setup) comprise a very small part of the total amount of data
being backed up. What's the point in going through all of the
extra work (and risk) to exclude 1 GB of data from a 100-GB system?
However, in the small shop, I'm a little bit more flexible. The
main reason for this is that the ratio of OS and applications to
data is usually much higher in the small shop, which is often made
up of PCs running dozens of bloated applications with very little
user data. For example, even with all the documents and email that I've
saved over the years, my "My Documents" folder is just
over 1 GB. However, the 56 applications in "Program Files"
and the operating system in WinNT add up to more than 3 GB. (We won't
talk about the 6 GB of MP3s on my D: drive, okay?)
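The difference between the two cases is easy to see as a fraction. This short Python sketch uses the rounded figures from the examples above (1 GB out of a 100-GB system, and roughly 3 GB of programs against 1 GB of documents); the function name is hypothetical.

```python
def os_and_apps_share(os_and_apps_gb, user_data_gb):
    """Fraction of a full backup taken up by OS and application files."""
    return os_and_apps_gb / (os_and_apps_gb + user_data_gb)

# 1 GB of OS and applications on a 100-GB enterprise system
print(f"Enterprise server: {os_and_apps_share(1, 99):.0%}")   # 1%
# Roughly 3 GB of programs and 1 GB of documents on the laptop
print(f"Laptop: {os_and_apps_share(3, 1):.0%}")               # 75%
```

Excluding 1% of a backup is hardly worth the risk, while excluding about three quarters of it saves a great deal of tape.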
If your shop is small enough, you can configure your systems to
save quite a bit of tape space. Here's what I do:
1. Identify any applications that require you to save their data
in their directory. As much as I like Eudora, I don't like
the fact that all mail folders must be saved underneath where you
installed Eudora. There's not much you can do to get around
this on a Windows machine, but you can usually trick a UNIX application
that does this by creating symbolic links to a common directory.
2. Move/reinstall such applications into a common directory. The
common folder on my laptop is the "My Documents" folder.
Because Eudora will not allow me to specify the location of its
mailboxes, I move and reinstall Eudora into the My Documents folder.
In the case of a UNIX application, you can often just move the data
and create symbolic links. Please consult your application vendor
before doing this.
3. Identify all applications that allow you to customize where
documents are stored. Most applications will allow you to specify
where their data is kept. To continue my Microsoft-based examples,
Tools -> Options -> File Locations in Word will allow you to specify
which directories hold that application's data. UNIX database
applications, such as Oracle, are also good examples of applications
that allow you to specify file locations.
4. Move all other documents into a common directory. Choose a
common directory and move all of your data into it. This should
be the same directory as the one you used in Step 2.
5. Customize all applications listed in Step 3, telling them to
look for and save documents in the common folder.
If done properly, this would allow you to back up only the common
directories, and exclude the operating system and most applications.
Although this method is completely opposite to what I recommend
for large environments, small environments may be able to successfully
implement this configuration, and save themselves quite a bit of
tape.
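On a UNIX system, the move-and-symlink trick and the resulting data-only backup could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions: the function names and the example paths are hypothetical, the application must be stopped while its data is moved, and the archive should be written somewhere outside the common directory.

```python
import os
import shutil
import tarfile

def relocate_with_symlink(app_data_dir, common_dir):
    """Move app_data_dir under common_dir and leave a symlink at the old path."""
    source = os.path.abspath(app_data_dir)
    common = os.path.abspath(common_dir)
    os.makedirs(common, exist_ok=True)
    target = os.path.join(common, os.path.basename(source))
    if os.path.lexists(target):
        raise FileExistsError(f"{target} already exists")
    shutil.move(source, target)
    os.symlink(target, source)  # the application still finds its old path
    return target

def archive_common_dir(common_dir, archive_path):
    """Write a gzip-compressed tar archive of the common directory only."""
    common = os.path.abspath(common_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(common, arcname=os.path.basename(common))
    return archive_path

# Hypothetical usage:
# relocate_with_symlink("/opt/mailer/folders", "/data/common")
# archive_common_dir("/data/common", "/backup/common.tar.gz")
```

Because the old path becomes a symbolic link, the application keeps working unchanged while the backup only has to cover the one common directory.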
Choosing and Installing Backup Software
There are a number of free backup software packages listed at:
http://www.backupcentral.com/toc-free-backup-software.html
If you've got a tape drive and just want a quick backup without
installing and customizing a big package, try hostdump.sh.
If you'd like a utility that can back up many gigabytes of data
using disk caching and tape libraries, then Amanda's your best
bet. Between these two programs are several other free utilities that
may also serve your needs. You can learn all about installing and
configuring Amanda by reading the book excerpt at: http://www.backupcentral.com/amanda.html.
In addition to these utilities are the commercial backup utilities
listed at: http://www.backupcentral.com/software-directory.html.
Although many of these products are very expensive, there are a few
inexpensive products that can help get you started on your way to
good backups.
Do It
I hope these suggestions will get you started on your road to
recovery. Before you do anything else, get yourself a backup of
some type using whatever you need. Once you've done that, please
do the research necessary to choose the appropriate backup software and hardware
for your environment. I'll provide more information next month!
[1] What you need is a visit from "Mister Backup." I pity
the fool that tells me backups aren't important!
[2] Both the capacity and transfer speeds listed in this table include
the vendors' stated average compression ratio -- typically
2 to 1.
W. Curtis Preston has specialized in storage for over eight
years, and has designed and implemented storage systems for several
Fortune 100 companies. He is the owner of Storage Designs, the Webmaster
of Backup Central (http://www.backupcentral.com),
and the author of two books on storage. He may be reached at curtis@backupcentral.com.
(Portions of some articles may be excerpted from Curtis's books.)