Cover V08, I04
Article
Figure 1

apr99.tar


Network Backup Configuration: A More Complex Scenario

Ira Goodman

In the article "Network Backup Configuration: A Primer," which appeared in Sys Admin in November 1998, I discussed the fundamentals of developing a network backup configuration and designed a configuration for a sample site called BCD Incorporated. In this article, I will extend the design by adding several complexities to the situation and apply the same strategy.

First, I'll review the situation at BCD where the FDDI network is quickly becoming a bottleneck. An excellent solution for the problem is subnetting, which fits in perfectly with a distributed backup strategy.

Subnetting will also provide an easy solution to another problem the backup administrator is facing at BCD, a merger of his company with the WXY Corporation. The admin can merely add the WXY site, which has a centralized backup configuration, to the BCD site as another subnet.

Although there are several complexities to discuss, we will be able to apply the same flexible procedure that we used to configure the BCD site last time with a few minor alterations. The first two steps will be applied to the WXY site:

  • Inventory hardware and software.
  • Document backup requirements.

The next four steps will be applied to the merged site (now called the BCD-WXY Company):

  • Choose the backup servers.
  • Distribute the backup devices.
  • Consider media handling for the merged site.
  • Set a schedule for the merged site.

Following these steps will help clarify the backup situation at WXY and at the company that will exist after the merger. They will also allow the BCD-WXY Company to achieve its primary goal: completing a thorough backup as quickly and efficiently as possible and restoring backup data in a timely fashion.

Subnetting Breaks Bottleneck

The elements shown in blue in Figure 1 present the configuration developed last time for BCD Inc. All the machines were arbitrarily grouped by operating system, and backup data was routed to local backup devices. However, all the backup data still had to travel across the FDDI network to reach the target devices. The data on the UNIX machines traveled over the network to reach an automated tape library. The data on the NT machines traveled over the network to a DLT 4000, and the data on the NetWare machines traveled over the network to a DLT 2000XT machine.

In the subnetting scheme illustrated by the dotted line in Figure 2, each of the groups becomes a subnet with a gigabyte switch. This strategy confines backup data traffic to the subnet and keeps the backup data off the main FDDI network, yet it allows messages to pass between the device servers on the subnets and the master server, an RS/6000 machine. In addition, the master server is now connected to a DLT 4000. This allows the backup catalog and any other data on the master server to be backed up directly to the DLT 4000, again keeping it off the main FDDI network.

The machines running UNIX are connected to switch A, the machines running Windows NT to switch B, and the machines running NetWare to switch C. The only difference between the subnets is that each UNIX machine has its own network interface connecting it separately to the device server. The device server in turn is connected to the automated tape library. This allows all three machines, two of which have high concentrations of data, to send backup data to the device server more efficiently.

Subnetting makes configuring a backup strategy for the merger easy. The entire WXY network with its centralized backup configuration can be connected quickly and easily to the main BCD FDDI network as a subnet. Backups can begin immediately, while network administrators decide which departments they want to consolidate or organize differently. This article deals only with the aspects of the merger related to backup processing immediately after the merger. Other aspects, such as consolidation of departmental functions and data standardization, are beyond its scope.

Hardware and Software Inventory at WXY Corp.

The first step in creating an efficient configuration of any site is an inventory of the hardware and software at that site. In the previous article, we made an inventory for BCD, and now we will do one for WXY.

The WXY Corporation has UNIX, Windows NT, and NetWare machines, just as BCD does. However, the company uses different UNIX operating systems, and one of the NT machines is considerably larger than those at BCD because it holds a data warehouse. The elements shown in red in Figure 1 represent the original configuration of the WXY site, which is set up for centralized backup.

Here are the machines at WXY, grouped by operating system:

UNIX - One SGI server running IRIX 6.5 with 100 GB of data, and one Alpha server running Digital UNIX with 300 GB of data. This second machine holds the company database.

NetWare - One NetWare machine running release 4.11 with 10 GB of data. This machine is used exclusively for printing services.

Windows NT - Three machines running Windows NT 4.0, each with a different hard disk capacity: 200 GB, 36 GB, and 18 GB. The largest machine holds the company's data warehouse, which consists of data extracted from the company database and aggregated. The data warehouse engine has recently been upgraded to Oracle 8, and the extracted data is staged on this machine and loaded once a week.

Backup Devices at WXY

WXY owns one very large automated tape library with 120 slots and six 8-mm AIT tape drives. Each slot has a capacity of 25 GB, and each drive has a transfer rate of 3 MB/sec (6 MB compressed).

Finally, WXY has an ATM network with a bandwidth of 75MB/sec (264 GB/hour).

WXY's Backup Requirements

As was the case with BCD, WXY prefers "lights out," unattended backup, if possible. However, their centralized backup strategy, even with a powerful ATM network and a large automated library, is beginning to experience scheduling problems that cannot be solved easily with a centralized backup configuration. The main difficulty is that WXY has several European subsidiaries that require access to the WXY network during their business day, resulting in a smaller backup window than at BCD. These international agents would like as much access as possible and are expected to request it soon for their entire business day, because their sales volume is increasing rapidly.

Finally, the backup product currently in use at WXY will become a legacy product when the merger with BCD takes place. This requires the maintenance of two backup products until the last legacy backup tape retention period expires, approximately six months after the merger takes place.

Here then are the basics:

  • Total amount of data to be backed up: 400 GB on UNIX, 236 GB on Windows NT, and 18 GB on NetWare. Sum total: 654 GB.

  • Amount of data that changes daily: Initial estimate of 20%.

  • Current backup window: 10 p.m. to 1 a.m. Monday to Friday. The system is also available for 15 hours for backup on Sundays from 10 a.m. to 1 a.m.

  • Special types of files that must be backed up: Some large database tables and large data warehouse tables. The backup of small files on the Solaris machine at BCD will be phased out because WXY's more automated order processing system will be adopted. However, the files and the ability to restore them will have to be available for several years after the merger. This affects the availability of some hardware resources, but not the backup schedule.

  • Data retention requirements: Six weeks for most data. The first backup of each month is held off site and retained for six months.

  • Future organization-wide plans: Merger with BCD Inc. is imminent. This will result in the need to maintain a legacy backup product for at least six months. The backup window of WXY will be adopted, and a compromise retention period of one month will be instituted for all backups.

The Master Server at BCD-WXY

The current master server at WXY Corp. is a Digital Alpha machine. It is running Digital UNIX, and is directly connected to an 8-mm AIT automated tape library. When the merger between BCD and WXY takes place, the master server at BCD, an RS/6000 running AIX, will immediately become the master server at the combined BCD-WXY site.

Because of the resources needed to condense the backup catalog, especially a catalog that is expected to grow considerably when the BCD and WXY sites are merged and consolidated, the RS/6000 was chosen. It is a powerful machine that is unencumbered with additional processing duties. Additionally, it can assume backup duties for the merged company immediately because it is already set up for distributed processing. The WXY servers and devices need only receive appropriate software and be identified to the backup product on the designated master server for backup processing to be initiated when the merger is finalized.

Device Servers at the BCD-WXY

All the device servers at the BCD site will remain in place after the merger. The current master server at the WXY site will become the device server for the AIT library in the distributed backup configuration of the new company.

Since the amount of data on the Digital UNIX machine is expected to grow considerably, a new 8-mm AIT library was ordered for data warehouse and other types of backup. The NT machine with the Oracle 8 data warehouse will become the device server for the new AIT library, and this change in configuration will help to balance backup processing, improve performance, and add scheduling flexibility.

Media Handling at BCD-WXY

The tape formats of backup devices at BCD and WXY are incompatible; BCD has DLT drives, while WXY has 8-mm.

Although this incompatibility will not affect the transition to BCD's distributed backup product after the merger, care will have to be taken to ensure that enough backup devices of each type will be available for restoring data. Data backed up on DLT tape must be restored on DLT drives and data on 8-mm tapes to 8-mm drives.

Setting a Schedule for BCD-WXY

Scheduling "lights out" backup will be more complex at the BCD-WXY because of several factors:

  • Reduced backup window
  • Larger incremental backups (up from 10% to 20%) and an increased rate of data growth (estimated to be 40% per year for the merged company)

One factor that does not seem to be a problem in the merged configuration is network bandwidth - at least for backup. Since subnetting is now in place and only messages will travel over the FDDI network during backup, its relatively small bandwidth of 36 GB/hour is no longer a concern. The ATM network on the subnet that was formerly the network for WXY will remain the network for the WXY subnet. The bandwidth of 264 GB/hour appears more than adequate, especially with the addition of a second automated library on that subnet.

Subnetting has also removed the necessity of compressing data to reduce the amount of data traveling over the network. The advantage of compression is that it generally cuts in half the amount of data that travels across a network and the amount of data that needs to be stored on tape. However, because compression can be a very time-consuming and resource-intensive process, it must be used very selectively.

Since the transfer rate of the backup devices seems to be the only limiting factor, we will note them, and then calculate the amount of time needed for the backup of each client and server machine on the merged network.

Transfer rates:

  • DLT7000 = 5 MB/sec (35 GB capacity)
  • DLT4000 = 1.5 MB/sec (20 GB capacity)
  • DLT2000XT = 1.25 MB/sec (15 GB capacity)
  • 8-mm AIT = 3MB/sec (25 GB capacity. 35 GB with long tapes)

Capacity is important because "lights out" backup demands that no tapes be changed in individual devices. Although automated libraries will handle tapes automatically, there must be sufficient tapes and slots in the library when the backup is initiated. Note that AIT now has a capacity of 35 GB, on par with the capacity of DLT7000 tapes, but early AIT devices will require a firmware update to handle the longer tapes.

Estimated backup time for the BCD machines:

  • RS/6000 master server and device server - 0.5 hour
  • RS/6000 - 3 hours
  • HP9000 - 3 hours
  • HP9000 database server - 3 hours (cold backup)
  • Solaris - 0.5 hour
  • Windows NT (all machines) - 3 hours
  • NetWare (all machines) - 2.5 hours

Estimated backup time for the WXY machines:

  • Digital UNIX database server - 10 to 15 hours (hot backup)
  • SGI - 2 hours
  • NetWare - 0.5 hour
  • Windows NT WebServer - 0.5 hour
  • Windows NT Email - 0.5 hour
  • Windows NT Data Warehouse - 5 hours (cold backup)

According to our calculations, which are based on a worst case scenario, backup processing should fit comfortably within BCD-WXY backup window. However, before describing the scheduling in some detail, let's look quickly at the question of hot and cold database backup.

Hot or Cold Database Backup?

Some sites never need to decide between an online ("hot") and an offline ("cold") database backup. Any site that must provide 24-hour access to a database must back up that database while it is online.

Hot database backup can take twice as long as a cold one. Additionally, a cold backup is very "clean," that is, all tables are backed up at the same time. With a hot backup, the data in each table is backed up at a different time.

A specific example will help us understand the problem. The online backup of a 500 GB database to a DLT7000 library with four drives will take up to 18 hours. A cold backup will take 7 hours. However, the database will be available for all of the 18 hours during which the hot backup is running. The cold backup is faster, but it also means 7 hours without database access for all users. If an organization can tolerate the inaccessibility of its database for such a long time, say, on a weekend, an offline backup is always preferable to an online one. However, fewer and fewer organizations can live with such a situation, and, because of this, online backup is growing in popularity.

Data compression is a similar issue. Data compression is very CPU-intensive, and compressing data only for performance or scheduling reasons may not be justified. However, in some cases, factors such as cartridge capacity, which affects whether or not a backup can be "lights out," may justify using the additional resources to compress the data. Compressed data can eliminate a cartridge change.

WXY opted for online backup for its main database because it wants to provide as much access time for its salespeople and foreign agents as possible. However, the data warehouse machine is backed up offline because it is used only by the marketing department in Massachusetts. The warehouse is reloaded only once a week, but data is downloaded and staged on the data warehouse server daily.

Detailed Schedule

BCD-WXY will follow a schedule of base and incremental backups as soon as the two companies merge. Because of subnetting, all the base backups of the former BCD Corporation can be run simultaneously, and will complete well before the end of the Sunday backup window.

The heaviest concentration of data, which is on the UNIX subnet, should take about 9.5 hours. The backup of the HP9000 can be scheduled between the backups of the other two large servers (the RS/6000 and the other HP9000, both with 100 GB of data) in case any extra time is needed to take the database offline or bring it online after backup.

The incremental backup of the database server should also not be a problem since incremental backup at BCD was only 10% of the data, that is, about a half hour. Even at 20%, the incremental backup would be completed in just under one hour, well within the weekday window.

Since a hot backup does not need to be scheduled within the backup window because the database is always available, scheduling on the WXY subnet should not present any problems. A cold backup of the database server is now an option because the backup of all machines except the data warehouse machine (a total of 8.5 hours for base backup) would fit easily within the backup window of 15 hours on Sunday. However, if management insists that the database always be available, the other machines can be backed up first, which would take approximately 3.5 hours, followed by the hot database backup, which could then be allowed to run until completion. The hot backup should end by 5 a.m., and that is well before the incremental backup window at 10 p.m. on Monday night.

Distributed backup is important here. The ability to connect the data warehouse machine to a separate backup device saves five hours in backup time, allowing the hot backup to complete earlier. And since only a base backup is done on the data warehouse, the other machines on the network can back up their data incrementally to the library attached to the warehouse during the week, allowing the hot incremental backup of the database to complete earlier.

Retention periods should pose no problems, especially because the small files on the Solaris machine, which must be retained for three months, will very likely be phased out soon after the merger.

Summary

A distributed strategy, both in networking and backup, allows a great deal of flexibility in backup configuration. Backup processing can be isolated within subnets, allowing far more multi-tasking and concurrent processing than a network without subnetting. A backup product that allows distributed backup configuration permits backup devices to be connected where they are needed most, facilitating scheduling within tight backup windows.

BCD Incorporated had a distributed backup strategy, but its FDDI network had become a bottleneck because all data had to travel over the network in order to be backed up, even though the devices were distributed. This bottleneck was broken with subnetting.

Subnetting also made backup configuration planning easy for a merger between BCD and the WXY Corporation. The WXY site was added as a subnet to the BCD network for purposes of backup. A distributed backup strategy also allowed a backup device to be added with ease, preventing scheduling problems for the required hot backup at the newly merged site.

About the Author

Ira Goodman is a software services manager at Syncsort Inc. He has 24 years experience in information processing, and has managed backup product support for almost a decade. He currently manages support for Backup Express, a backup product for networks running UNIX, NT and Netware, and advises customers on the setup and maintenance of distributed backup systems and automated libraries.