Storage Consolidation -- Part 3 -- Implementation Details

Peter Baer Galvin

Last month, my Solaris Companion column discussed product selection. Storage arrays, SAN switches, tape drives and libraries, and replication all have options to consider and tradeoffs to weigh. This month, the rubber meets the road as my column addresses implementation. As with all projects, the architecture must be correct, but the success of the final solution depends on the implementation and the details. Mistakes and oversights in implementation can turn a SAN dream into a storage nightmare, so I hope this column will provide information to help you rest easy.

Design Completion

The final steps in preparing to deploy that shiny new consolidated storage solution involve several crucial decisions. These issues, based on your circumstances, might already be resolved in your design plan, but typically the hard decisions are pushed until the end. If the storage is going into a new facility, many of these questions are moot. However, since most storage is placed in existing environments, I'll assume this is a legacy-integration circumstance.

The first decision involves choosing where data and applications are going to live. Now is a great time to address any compromises your site may have made along the way. Take this opportunity to put elegance and style into your environment by striving for ease of management and enhanced functionality. As a practical example, consider moving all applications and data from internal disks and direct-attached disks to SAN- and NAS-attached disks. Once done, even your largest servers could be considered commodity Field Replaceable Units. For instance, consider a database server that is at performance capacity. If its applications and data were on a SAN, moving the database to a faster machine would no longer be a dreaded task. Rather, you could simply unmount the file systems from the current machine, mount them on the new machine, modify startup files, cronjobs, and system variables, and the upgrade would be done. If you encounter problems, undo the steps and you are back to a working system. Well, it's not quite that simple, but certainly better than the alternatives.

Another method of simplification is to store all shared or shareable files on a NAS server, with all dynamic data or performance-desiring data on SAN storage. For example, store all administration tools on NAS and mount them on all servers. Multiple versions can be kept there, all updates centralized, and all tools will then be available on all machines.

Host-Side Issues

A second decision must be made with respect to host attachment to SAN and NAS. For NAS, the practical question is whether to go with 100-Mb or 1-Gb networking, which is driven by performance requirements and available network switching. Combinations are sometimes the best solution, with some hosts at 100 Mb, some at 1 Gb, and the NAS using multiple 1-Gb connections. From Solaris 8 or Solaris 9 servers, IP multipathing now allows multiple network links to be used in concert between the server and the NAS machine. (For more on IPMP, see the Solaris Companion, November 2001 at http://www.samag.com/documents/sam0111i/.)

The host-side challenges of SAN deployment involve the host bus adapters and attachment resiliency. Typically, the Fibre adapter in the host involves a choice among a few vendors. The storage array and switch vendor will have a say in what is supported. Be sure that all components are supported by at least one of the involved vendors. For example, the array vendor might support the HBA and the switch as well as the array. For service then, one call could resolve any problems in that chain.

Determining what is "best" is more of a challenge. 1-Gb and 2-Gb HBAs for Sun servers are available, and the choice there is as in the NAS -- whether the switch can run at both speeds and auto-detect the correct speed. The functionality of multiple HBAs within a server also requires evaluation. In the best case, multiple instances of HBAs are supported, traffic is load balanced among them (sometimes called "multipathing"), routed to the working ones if one fails, and routed through all of them upon repair. In reality, it is sometimes difficult to get all of these features. For example, an HBA (and its drivers) might do multipathing, but might need a system reboot to restore all links after a failure is repaired.

Software for storage load balancing can be provided with volume management (as with Veritas Volume Manager) or could be provided by the storage array vendor (usually at a cost). Some argue that the former is better because no array-vendor software needs to be installed, maintained, or debugged on the host. Others argue that if the vendor supports the multipathing from the software to the array, then it is worth the cost. Another multipathing alternative is Solaris 8 MPXIO (in release 07/01 and later), known in Solaris 9 as Sun StorEdge Traffic Manager. This solution is kernel based, and knows enough to show a device in the device tree only once, even if the device is seen on multiple paths from the system.

Layered products can also complicate the support issue. For example, is a Sun Cluster with Oracle 9i RAC supported with Traffic Manager? How about Veritas Volume manager with Traffic Manager rather than DMP? For each host, support must be considered from the software stack through the hardware devices.

Transition Considerations

Further complexity arises when the host is connected to external, direct-attached storage. To transition from the old storage to the SAN, the host may need to be connected to both sets of storage. If so, are there enough I/O ports in the machine to allow all of those connections?

Another concern is the actual data copy. Extensive downtime could be required if the file systems are large. Copying raw disk is even more interesting. One of the safest ways to perform these copies is via backup software. A tape backup could be restored, but that method would be inefficient; it would be more effective to perform a disk-to-disk backup/restore (but that ability depends on the backup software in use). If the software supports hot backups, then the copy could be done live, with a final differential or incremental backup/restore done as the final step before going live on the new storage. Of course, a test of the entire procedure, and extensive testing of the new storage before making the commitment, should be a main part of the transition plan.

Probably the best way to transition data between storage devices is to use software mirroring. For example, if the existing system is using Veritas Volume Manager, and the host can accept a concurrent connection to the new storage, then the volume manager can be configured to treat the new storage as a mirror of the old. This "clean" copy will still need testing, but this is a far simpler method than the alternatives.

Another potential transition hazard involves operating system upgrades. An upgrade may be needed to meet support requirements for HBAs, device drivers, multipathing, or other storage components.

If the SAN includes a new backup solution, that transition can be complex as well. Aspects of the new solution to test include full restores, incremental restores, operating system restores, and database restores with database recovery. If hardware third-mirror backups will be performed (e.g., timefinder and its competitors), backup and restore integration must be performed and tested. Arrangements must be made for restoration of old media using the old backup software.

SAN Switching Implementation Complexities

The SAN switch implementation plan should describe exactly what plugs into each port. There are more subtle issues, too. Most storage arrays allow only one type of operating system to speak to each array port. The switch needs to honor this by having only hosts of that type in the switch zone (think of it as a VLAN) for that array port. Thus, Windows NT machines would be in one zone, while Solaris machines would be in another, and HP-UX systems in a third. A fourth zone could be for host-to-tape library communication (in the case of LAN-free backups).

Some sites prefer to use separate switches and HBAs for the backup SAN, for simplicity. Also, common backup solutions do not support automatic failover if multiple paths exist from the host to the tape array. Thus, careful attention should be paid to which path to the tape drives is used. Multipathing will not automatically fail over from one tape to another; however, with proper planning, the loss of one switch may disable access to only half of the tape drives. Manual intervention can also allow easy reconfiguration to regain access to the other drives. Of course, redundant switches should be configured for the storage SAN. Those switches can be connected to each other to form a "fabric", which eases manageability of the SAN (all devices can see all other devices). However, it can reduce availability if a problem on one switch cascades through the link to the other switch.

LUN masking (think of it as IP filtering but, in Fibre Channel, world wide names are used in place of IPs) is a method of assuring that a host sees only the LUNs designated to be mounted on that host. Once again, this problem has multiple solutions. Many HBA drivers can be configured to show the host OS only the intended LUNs, but this requires specially configuring every HBA and giving the responsibility to the host. A more centrally controlled option is to enforce LUN masking in the storage array. The drawback to this option is that, if an HBA is replaced, the configuration will need to be updated on the storage array.

LUN security can be enhanced through switch zoning. Depending on which switch vendor was chosen, you might have the option of hard zones (zones that contain physical switch ports), or soft zones (zones that are made up of WWNs). When switch zoning is used, a given host, if compromised, would not be able to gain access to the storage of other hosts. Switch zoning can also be used to prevent, for example, a Windows system from trying to grab every LUN on every storage array. The system would be in a zone with only its LUNs, unable to see (or grab) LUNs not intended for it.

Individual ports in the switch may need specific settings, based on the kind of host that is communicating with them. Unfortunately, there is no SAN management software that can effectively manage the hosts, switches, and arrays as one unit. Rather, each component must be configured via its own interfaces. Multiple companies and coalitions are working to remedy this, so hope is on the way. In the meantime, painstaking planning and documentation will ensure that all configurations are correct, and that any changes have the desired effect.

For example, for each host, record as appropriate:

The type of path failover
The type of application failover
Switch connections
HBA vendor, model, world wide names, driver version, firmware version, host bus slot of the card, and zone for which it is configured

For each HBA, record as appropriate:

Each of the LUNs assigned, and the LUN details (size, RAID level, etc.)

Disk and Tape Array Implementation Complexities

As for the disk array(s), configuration details include RAID sets, port use, port modes, hot spare locations, and any vendor-specific information pertinent to the array (e.g., LUN type, cache configuration, and quality of service information). Three-way mirroring and remote replication each need configuring and testing. Again, combinations can add to the confusion. For example, in one case, deployment of a replication solution between two sites was delayed for months because the hosts using the LUNS in question were clustered, and the hosts at the secondary site were refusing to mount the replicated LUNs. So, while any given component can be straightforward to design, configure, and test, combining components can cause exponential growth in complexity.

The Challenges of Management and Monitoring

Management tools are currently a challenge because they are vendor and component specific. Monitoring tools do exist that give an overview of the environment and can alarm on specific conditions. For in-depth monitoring, vendor-specific tools are the best way to go. For example, a host-based tool could determine that a file systems' disk space is shrinking rapidly. It would not see the fact that a disk in a RAID set failed and that the hot space is now in use. Fortunately, phone-home capabilities are now becoming common. But the result is an ad hoc set of tools and methods used to monitor and manage the SAN facility. Extensive documentation (as described above), smart administrators, training, and cross-training are the best solutions here.

Conclusions

The issues, complexities, and challenges described in this article are not intended to scare you away from storage consolidation projects. Despite the efforts needed to convert a facility from direct-attached to SAN and NAS-attached storage, the end result is well worth achieving. Vendor independence of storage, the ability to bring new storage where needed at a moment's notice, more resilience and automatic failure recovery, enhanced performance, and cost savings from efficient storage use and decreased day-to-day management efforts are rewards awaiting those who successfully consolidate.

Peter Baer Galvin (http://www.petergalvin.org) is the Chief Technologist for Corporate Technologies (www.cptech.com), a premier systems integrator and VAR. Before that, Peter was the systems manager for Brown University's Computer Science Department. He has written articles for Byte and other magazines, and previously wrote Pete's Wicked World, the security column, and Pete's Super Systems, the systems management column for Unix Insider (http://www.unixinsider.com). Peter is coauthor of the Operating Systems Concepts and Applied Operating Systems Concepts textbooks. As a consultant and trainer, Peter has taught tutorials and given talks on security and systems administration worldwide.