Network Backup Configuration: A Primer
Configuring an enterprise-wide backup is especially challenging in network environments because of the many variables involved. Most sites have a unique mix of hardware and software components, which sometimes interact in unexpected ways. Additionally, sites often have special backup and restore requirements. Yet the goal for all sites is the same: complete a thorough backup as quickly and efficiently as possible and be able to restore backup data in a timely fashion.
The best methodology for approaching such a complex problem is, of course, a systematic one. Here is the basic procedure:
- Inventory hardware and software
- Document backup requirements
- Choose the backup servers
- Distribute the backup devices
- Consider media handling
- Set a schedule
In this article, I will discuss these five steps and configure a backup for a sample site, which I'll call BCD Inc. This site is a mixed environment, that is, data must be backed up from machines running three very different operating systems. It also has a variety of devices to which data can be backed up, and utilizing all of them requires a distributed approach.
However, the solution I have proposed for BCD Inc. is an interim one. It is a basic scheme, designed to illustrate the fundamental ideas in network backup configuration. It is also the kind of scheme with which sites "graduating" to a commercial-strength backup would begin.
Complex and very cost-effective distributed backup schemes can be built on the simple solution suggested for BCD in this article. For example, subnetting could easily be added with a small investment in switching technology. With subnetting, more backups could be run in parallel, providing greater speed and scheduling flexibility. However, subnetting and other more sophisticated strategies are beyond the scope of this primer.
Inventory Hardware and Software
If you don't have a detailed inventory of your current equipment, now is the time to develop one. The more detailed it is, the more solid and reliable your backup configuration will be. A thorough inventory also lets you troubleshoot your backup and plan for future growth much more easily.
Note the following about each machine:
- Number of processors and their speed
- RAM size
- Hard disk capacity and current free space
- Physical location
- Operating system
- OS level
- OS release
- Applications running on it
You must also survey the backup devices you have on hand. For standalone tape drives, you'll need their capacity and transfer rate. If you have an automated library, you will also need to know the number of storage slots, number of drives, and number of SCSI connections. You must also know the bandwidth of your network.
Document Backup Requirements
Finally, you should document the backup requirements for your organization. Here are the basic things you must know:
- Total amount of data to be backed up
- Amount of data that changes daily
- Current backup window
- Any special types of files and databases
- Data retention requirements
- Future organization-wide plans
Let's consider this information in the context of a specific example.
Example: Hardware and Software Inventory at BCD Inc.
Our sample site has machines running some of the most popular operating systems in use today. Because of space limitations, I have only included the basic information we need to do a configuration, and not all the information I would record in an inventory. Here are the machines at BCD, grouped by operating system:
- Two RS/6000 running AIX 4.x, one with 10 GB of data and one with 100 GB
- Two HP9000, both running HP-UX 10.x with 100 GB of data
- One Sun server running Solaris 2.x with 10 GB of data
Two machines in the UNIX group have special requirements. One of the HP9000 machines is running an Oracle database, which can be shutdown for a "cold" backup. Because we want the database to be down for the shortest time possible, we will compress the data for backup, which will reduce it by approximately 50%.
The Sun server is running a special application that forces it to store data in very small files. This data also has a different retention period, described below.
It's always important to note the location of any very large files, especially database files, and any concentration of very small files because both types can require special handling to improve performance.
- Five NetWare machines running release 4.11, each with 2 GB hard drives.
- Three machines running Windows NT 4.0, each with a different hard disk capacity:6.4 GB, 2 GB, and 8.2 GB.
Example: Backup Devices at BCD
BCD owns three backup devices:
- An automated tape library with 40 slots and two DLT7000 tape drives. Each slot has a capacity of 35 GB, and each drive has a transfer rate of 5 MB/sec (10 MB compressed).
- A DLT4000 tape device with a 20 GB capacity and a transfer rate of 1.5 MB/sec.
- A DLT2000XT tape device with a 15 GB capacity and a transfer rate of 1.25 MB/sec.
Finally, BCD has an FDDI Ring network with a bandwidth of 10MB/sec (36 GB/hour).
Example: BCD's Backup Requirements
BCD has no extraordinary backup requirements, but the goal of their system administrator is to provide "lights out" backup whenever possible. A "lights out" backup is one that requires no tape handling or operator assistance after hours.
Here are the basics:
- Total amount of data to be backed up: 340 GB on UNIX, 16.6 GB on Windows NT, and 10 GB on NetWare. Sum total: 366.6 GB.
- Amount of data that changes daily: initial estimate of 2%.
- Current backup window: 8 p.m. to 5 a.m. Monday to Friday, and the system is also available for backup on weekends.
- Special types of files that must be backed up: some large database tables and a concentration of very small files on UNIX machines.
- Data retention requirements: 3 weeks for all data except the concentration of very small files on the Solaris machine. These must be retained for 3 months.
- Future organization-wide plans: data is expected to grow at a rate of about 10% per year. However, merger talks are in progress, and backup data may grow considerably if the merger takes place. A data warehouse would also then be a possibility. Any new machines at the current site will very likely be running Windows NT because that area (marketing) is growing rapidly. The UNIX machines have plenty of spare capacity.
Choosing the Master Server
The first step after completing your hardware and software inventory and surveying your backup requirements is choosing your master server. This machine is the most important one in your backup configuration because it holds the backup product, the backup schedule, and the backup catalog.
Because servers today are attractively priced, many sites are able to dedicate a machine to backup processing and designate it as the master server. If you are buying a master server or looking for a candidate among your current machines, these are the characteristics it should have:
- The fastest machine (that is, the one with the highest megahertz rating) that you can afford or that is available
- At least 128 MB of RAM
A hard disk of 8 GB or more
If you are choosing from machines you already have, select the one with the least activity on it, that is, the fewest and smallest applications. Although a solid backup product will not need a lot of processing power for normal backup and restore jobs, catalog-related jobs can be very resource-intensive.
Catalog space is another consideration. Generally, you'll need disk space equivalent to 1% of the data you are backing up. The actual catalog will take up half of this space (0.5%), but you will need another 0.5% to perform catalog maintenance operations. Routinely maintaining the catalog is very important in controlling catalog size and improving catalog-related performance.
Example: The Master Server at BCD
The backup administrator at BCD chose the RS/6000 server with the fewest applications running on it as the master server. It is a powerful machine with a lot of disk space available, because it currently holds only 10 GB, far below its capacity.
Choosing Device Servers
A device server is any machine that has a backup device directly connected to it. The master server may also serve as a device server.
While it's best to choose a master server with as little data on it as possible, it's often advisable to use the opposite strategy when choosing device servers. The reason is simple. The data on a device server moves directly to the backup device and stays off your network.
If you are grouping machines (i.e., planning to have a set of machines back up all their data to one device), you should choose the most powerful machine in the group, which may often be the one with the largest amount of data.
Security might also be a consideration. If a machine in a group holds a lot of sensitive data, consider making it a device server. In this way, you can often avoid a resource-intensive encryption procedure before the data moves across the network to the backup device.
The tape formats in backup devices can have significant differences in capacity and throughput. A simple comparison will make this clear. DDS format, which has been on the market for several years, is available as DDS-2 and DDS-3. The maximum native (uncompressed) capacity of DDS-2 media is about 4 GB per tape, and its throughput is a maximum of 0.5 MB/sec. A DDS-3 tape jumps to 12 GB maximum native, and throughput to 1.2 MB/sec maximum.
The difference between DDS-2 and DDS-3 is already considerable. However, if you choose devices that use DLT format, you can use DLT7000, which has a maximum capacity of 35 GB native, and a throughput maximum of 5 MB/sec. Compression doubles both capacity and throughput.
A high tape density allows each tape to hold more data, thereby minimizing tape changes. You can also buy a large-capacity automated library with a very high tape density, but with only minimum slots configured. Then you can add slots as your data grows, allowing you to do "lights out" backup indefinitely.
Example: Configuration for BCD
After surveying the machines and backup devices at BCD, the following decisions were made. UNIX machines are the current workhorses at BCD. They hold the largest concentration of data, and they run an Oracle database. It seems logical to group these and back up their data to the automated tape library with its DLT7000 drives. The master server would then be a device server too, with the library directly connected to it.
The Windows NT machines hold the second largest concentration of data, 14.6 GB, and this type of machine is growing the fastest. Thus, it seems natural to group these machines and make the largest and most powerful NT machine a device server. The second most powerful device, the DLT4000 with its 20 GB capacity, could then be connected to it. Since a device server holds almost 50% of the data in this group, a sizeable amount of data would not have to travel across the network during backup.
The five NetWare machines also seem to fit naturally into a group. The data on them (10 GB) is not expected to grow, and would easily fit on a tape in the smallest backup device, the DLT2000XT, which has a capacity of 15 GB.
With this configuration, it is possible to run "lights out" backups. The tapes in the DLT4000 and DLT2000XT devices would only have to be changed once a week if base and incremental backups are used (unless data must be restored, of course). These types of backup and scheduling are discussed below.
Setting a Schedule
The term "backup window" generally refers to the amount of quiet time a network experiences in the evening and on weekends for backup. In BCD's case, the backup window during the week is 9 hours per day, and the weekends are also available. Normally, backups are not scheduled during the day because the network is too busy.
There are three basic types of backup:
Base backups are complete backups of all the specified data on the machines on a network, normally performed once each weekend. However, distributing devices allow base backups to be scheduled whenever it is convenient, for example, on week nights.
On week nights, most sites run incremental or differential backups. Incrementals back up only the data that has changed since the last backup of any kind. Differentials back up the data that has changed since the last base backup.
Incremental or Differential?
The choice between incremental and differential backups often depends on how fast a system must be restored if a major loss occurs, how large a system is, and device capacity.
When backing up a very large network, you might have to restore from a considerable number of tapes if you choose to do a weekly base and daily incremental backups. You would have to restore from the base backup tapes and then each set of incrementals, the number of which would vary depending on the number of days since the base. On the other hand, to restore from base and differential backups, you would need only the base tapes and the final set of differential tapes. If four or five days elapsed between the loss and the base backup, a base/differential restore would be significantly faster because fewer tapes would have to be handled.
The time backups take to run should also be considered. If 10% of the files at a site change per day, differential backups grow at this rate as the time between the differential and base lengthens. A full set of incremental backups (or the final differential backup) would then contain about 50% of the data in the weekly base backup. Since each incremental backup generally contains 10% of the base data, and a differential backup may contain as much as 50%, incremental backups are almost always faster to run.
A second consideration is "lights out" backup. A base of 100 GB with a file change rate of 10% per day will mean that a base plus a set of five incremental backups will contain about 150 GB of data. However, the same base plus a set of five differential backups may contain 250 GB. If you have a lot of data or small capacity backup devices, choosing differential over incremental can substantially increase the amount of tape handling and prevent "lights out" backups.
Example: Setting BCD's Schedule
Since "lights out" backup is more important at BCD than a very fast restore, the company has chosen base and incremental backups. A full set of incremental backups is equal to 10% of the base backup at BCD, so using incrementals allows all of the data in the NT and NetWare groups to be backed up to a single tape each week with some room for data growth. Although it would be best to stagger the incremental backup schedule for balanced resource utilization, the current set of system-wide incrementals could be run simultaneously because a relatively small proportion of the files change at BCD.
Under the proposed configuration, network bandwidth (36 GB/hour) is the only major scheduling bottleneck, and then only in the case of the large UNIX machines. Network compression would cut the amount of data traveling over the network by half, but it is a time-consuming and resource-intensive process. Thus, BCD chose only to compress data on the database server. To ensure that the database would be down for as little time as possible, BCD chose to backup the database tables as a "raw partition." This means they use a backup strategy that backs up data physically byte by byte but does not process the logical file structure information on the disk. Additionally, since the backup configuration at BCD is distributed, the machine holding the database could easily become a device server with a backup device directly connected to it. This would speed up the backup in general and allow the backup of the database machine to run concurrently with other backups. It would also keep the database table data off the network during backup.
Generally, the amount of time needed to back up data is limited by either network bandwidth (in this case, 10 MB/sec) or the transfer rate of the backup device (1.25 MB/sec for the DLT2000XT; 1.5 MB/sec for the DLT4000; 5 MB/sec for the DLT7000). Device servers are limited only by the device transfer rate, while client machines are limited either by the network or the device. With these limitations in mind, here are the approximate times for the base backups for each machine at BCD:
- RS/6000 master server and device server - 0.5 hour
- RS/6000 - 3 hours
- HP9000 - 3 hours
- HP9000 database server - 1.5 hours compressed
- Solaris - 0.5 hour
- Windows NT (all machines) - 3 hours
- NetWare (all machines) - 2.5 hours
A total of 14 hours is needed for the base, so this backup could be run on Saturday and Sunday, or the base for different machines could be run overnight on two or more nights during the week. The current configuration provides flexibility for the backup administrator.
Retention periods must also be taken into consideration in some cases. At BCD, the retention period for backups is three weeks (except for the small files on the Solaris machine, which must be retained for three months). Backing up the Solaris machine to the automated library would require a "logical tape pool" with six1 slots dedicated to this small file backup within the library, and would allow all backup processing to be "lights out." If the Solaris machine were backed up to the DLT4000, tapes would have to be changed frequently because of the different retention periods.
Initially, configuring a network backup seems to be a complex task. However, if you carefully follow the procedure described in this article, a variety of solutions will become apparent. Also important is a backup product that allows you to distribute backup devices to improve flexibility, performance, and efficiency.
By following the general guidelines described in this article, we developed a flexible, "lights out" solution for our sample site that will work very well in the short term and allow for future growth.
1 Figuring that incremental backups will be 15% of the base in this example, the total amount of data backed up each week is 17.5 GB, multiplied by 12 weeks, for a total of 210 GB. Since a DLT7000 tape has a capacity of 35 GB (210 GB/35 GB), 6 slots will be needed.
About the Author
Ira Goodman is a software services manager at Syncsort Inc. He has 24 years experience in information processing, and has managed backup product support for almost a decade. He currently manages support for Backup Express, a backup product for networks running UNIX, NT and Netware, and advises customers on the setup and maintenance of distributed backup systems and automated libraries.