Reliable Network with Solaris™
Peter Baer Galvin
Until recently, it was very difficult to configure a Solaris machine
to have redundant connections to a network, and to use them automatically
in case of a failure. Because of the magic of Solaris 8, the task
is now easy. If you are not using IP multipathing yet, you should be.
The Problem
Consider a Solaris host on a network. By default, it expects one
network connection per subnet to which it is being attached. If
it sees the same subnet on more than one interface, then one interface
is used for all outbound packets, and any interface can be used
by inbound packets (based on their destination addresses). Unfortunately,
if the one outbound interface fails, then traffic is outbound no
more.
Until recently, there were two standard methods to solve this
problem. One was to buckle down and write scripts that would ping
a device (say the default gateway). If the ping failed, the script
could configure another interface on that subnet to handle the traffic.
Of course, scripts must be debugged, supported, and updated, annoying
their authors.
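A minimal sketch of such a home-grown script might look like the following. It is not taken from any Sun product. The PING_CMD and IFCONFIG_CMD variables are hypothetical hooks added here so the logic can be exercised without touching a live network, and a POSIX shell (ksh, for example) is assumed.

```shell
# Sketch of a home-grown failover check (an illustration, not Sun's code).
# PING_CMD and IFCONFIG_CMD are hypothetical hooks; by default they run
# the real commands.
PING_CMD=${PING_CMD:-ping}
IFCONFIG_CMD=${IFCONFIG_CMD:-ifconfig}

# check_and_failover GATEWAY STANDBY_IF ADDRESS
# Probes GATEWAY once. If the probe fails, configures ADDRESS on
# STANDBY_IF so traffic can leave through the second interface.
check_and_failover() {
    gateway=$1 standby=$2 address=$3
    # Solaris ping accepts a timeout in seconds as its second argument.
    if $PING_CMD "$gateway" 5 >/dev/null 2>&1; then
        echo "ok"
    else
        $IFCONFIG_CMD "$standby" "$address" netmask + broadcast + up
        echo "failed over to $standby"
    fi
}

# Example call (gateway address is hypothetical), typically run from cron:
#   check_and_failover 10.1.3.1 qfe2 10.1.3.101
```

Even this small sketch hints at the maintenance burden: it has no fail-back, no logging, and no protection against a gateway that is merely slow.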
Alternatively, the Ethernet Trunking "Sun Consulting Special"
could be purchased from Sun Professional Services. This set of scripts
basically did the above work for you. Of course it cost money, and
was only somewhat supported by Sun.
The problem is exacerbated by Sun servers' roles in a variety
of different architectures. One example is shown in Figure 1, which
shows a standard three-tier architecture, as might be found at a
Web site. Most firewalls and load balancer clusters automatically
manage their IP addresses during a failover. They always make an
IP address available to the tier "above" them. Likewise,
a database cluster has its IP addresses managed by the cluster software,
which moves IPs between cluster servers as needed. The only components
in this environment that do not provide such functionality are
the Sun servers. Should a network cable, a switch port, or a host
bus adapter in the communications channel between a Sun server and its
network go bad, that server will be unavailable. Although the facility
will continue to function by using the redundant server, user connections
may need to be reestablished, state could be lost, and performance
could be negatively affected.
The Solution
IP multipathing was introduced in Solaris 8 release 10/00. It
has quite a bit of useful functionality when two or more interfaces
are attached to the same subnet. These interfaces are called a "group".
The functionality includes:
- Automatic load balancing of outgoing connections over the group.
The load balancing is on a per-TCP-session basis.
- Failover of a main IP address to an alternative interface,
should the primary fail to reach a remote destination.
- Active-active or active-standby failover modes (the other interfaces
can be active or just used in case the primary fails). Active-active
is recommended.
- Automatic fail-back once the failed component is fixed.
- Built in to Solaris with full Sun support and no cost.
IP multipathing does have some limitations and requirements as
well:
- Each interface in the group must have a unique MAC address.
- Each adapter in the group must be of the same type (token ring,
Ethernet, etc.).
- The interfaces on the same subnet must have an associated group
name.
- Each interface in the group must have a test address.
- Each interface should have a data address (for application
use).
- Alternate pathing (AP) and IP multipathing appear to be mutually
exclusive.
- It is best to use separate I/O boards or host bus adapters
for each interface in a group (rather than two interfaces from
the same QFE adapter, for example).
IP multipathing functionality is implemented by a new daemon,
in.mpathd. For failure detection and correction to work,
each interface must be configured with an additional IP alias to
be used as a test address. This second IP address must be on the
same network as the real IP address. The test address may not be
used by applications. When in.mpathd is started, it periodically
sends out an ICMP ECHO request ("ping") to the host's
gateway from each test address. If the gateway doesn't respond,
it sends a multicast to find any other host. If one or more hosts
reply, one is chosen as a new ping target for testing. If no reply
is received, in.mpathd assumes the link is down and moves
all addresses on the interface, except the test address, to another
interface in the group. It continues to probe the failed interface so
that it can detect when the problem is repaired, and returns the
addresses to their original interface when that happens.
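The probe sequence described above can be sketched as a small decision function. This is only an illustration of the described behavior, not in.mpathd's actual code. The function name is hypothetical, and picking the first responding host is an assumption (the text says only that one is chosen).

```shell
# next_action GATEWAY_OK [HOST ...]
# GATEWAY_OK is "yes" or "no"; the remaining arguments are hosts that
# answered the multicast probe (possibly none).
next_action() {
    gateway_ok=$1
    shift
    if [ "$gateway_ok" = yes ]; then
        echo "probe gateway"    # gateway answered: keep using it
    elif [ $# -gt 0 ]; then
        echo "probe $1"         # assumption: take the first responder
    else
        echo "failover"         # nobody answered: move the data addresses
    fi
}
```

For example, `next_action no 10.1.3.7 10.1.3.9` selects 10.1.3.7 as the new probe target, while `next_action no` reports that a failover is needed.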
IP Multipathing Implementation
To enable IP multipathing, several changes must be made to the
system. These can either be done at the command line, or as shown
here, in configuration files. For changes to be made permanent,
they must be made in configuration files.
First, make sure that each interface has a unique MAC address
by issuing the command:
setenv local-mac-address? true
at the "ok" prompt, or the command:
eeprom "local-mac-address?=true"
as root. A reboot is needed for this change to take effect, but make
the other configuration changes first to avoid repeated reboots.
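Before rebooting, the setting can be verified. The helper below is a small sketch (its name is hypothetical) that checks a line in the form eeprom prints when asked for a single variable, such as local-mac-address?=true.

```shell
# mac_setting_ok reads one line on standard input and succeeds only if
# it reports that per-interface MAC addresses are enabled.
mac_setting_ok() {
    read -r line
    [ "$line" = "local-mac-address?=true" ]
}

# Typical use as root on the Solaris host:
#   eeprom "local-mac-address?" | mac_setting_ok && echo "unique MACs enabled"
```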
Next, modify /etc/hosts to contain the additional network
addresses. It is a good practice to create an address for each interface
(the normal and dummy addresses). For IP multipathing, the system
needs test addresses for each interface as well. For host wb-1:
# External-facing interfaces
10.1.3.101 wb-1-e
10.1.3.102 wb-1-e-dummy
10.1.3.111 wb-1-e-qfe0
10.1.3.112 wb-1-e-qfe2
# Internal-facing-interfaces
10.1.4.1 wb-1-i
10.1.4.2 wb-1-i-dummy
10.1.4.11 wb-1-i-qfe1
10.1.4.12 wb-1-i-qfe3
Likewise, for wb-2:
# External-facing interfaces
10.1.3.151 wb-2-e
10.1.3.152 wb-2-e-dummy
10.1.3.161 wb-2-e-qfe0
10.1.3.162 wb-2-e-qfe2
# Internal-facing-interfaces
10.1.4.51 wb-2-i
10.1.4.52 wb-2-i-dummy
10.1.4.61 wb-2-i-qfe1
10.1.4.62 wb-2-i-qfe3
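The entries above follow a regular pattern: a data address, a dummy address one higher, and two test addresses offset by 10 and 11. As a sketch, a hypothetical helper can generate one block of entries so that the naming stays consistent across hosts. The function name and argument order are assumptions made for illustration, and a POSIX shell is assumed.

```shell
# hosts_block HOST SIDE NET BASE IF_A IF_B
# Prints the four /etc/hosts lines for one interface group:
# data, dummy, and one test address per interface.
hosts_block() {
    host=$1 side=$2 net=$3 base=$4 if_a=$5 if_b=$6
    printf '%s.%s %s-%s\n'       "$net" "$base"           "$host" "$side"
    printf '%s.%s %s-%s-dummy\n' "$net" "$(($base + 1))"  "$host" "$side"
    printf '%s.%s %s-%s-%s\n'    "$net" "$(($base + 10))" "$host" "$side" "$if_a"
    printf '%s.%s %s-%s-%s\n'    "$net" "$(($base + 11))" "$host" "$side" "$if_b"
}

# Example: reproduce the external-facing block for wb-1.
#   hosts_block wb-1 e 10.1.3 101 qfe0 qfe2
```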
The "group" name tells the system which interfaces are on
the same subnet, thus allowing proper test and result analysis.
Now update the /etc/hostname.* files to reflect the new network
names and enable IP multipathing on each set of interfaces. On wb-1:
/etc/hostname.qfe0:
wb-1-e-qfe0 netmask + broadcast + group wb-1-e deprecated -failover \
addif wb-1-e netmask + broadcast + failover up
/etc/hostname.qfe2:
wb-1-e-qfe2 netmask + broadcast + group wb-1-e deprecated -failover \
addif wb-1-e-dummy netmask + broadcast + failover up
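Similarly, edit the files for qfe1 and qfe3. The "deprecated"
flag keeps the system from choosing the test addresses as source
addresses for application traffic, as they are only for failure
detection. The "-failover" flag pins each test address to its own
interface, while "failover" lets the data address move to another
interface in the group if an interface failure is detected.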
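Because all four files share one layout, they can be generated rather than typed. The following sketch uses a hypothetical helper that prints the file contents in the same form as the qfe0 and qfe2 examples. The qfe1 and qfe3 arguments shown in the comments are inferred from the naming pattern in /etc/hosts, not taken from a tested configuration.

```shell
# hostname_file TEST_NAME GROUP DATA_NAME
# Prints the contents of an /etc/hostname.<interface> file: the test
# address on the physical interface, then the data address via addif.
hostname_file() {
    printf '%s netmask + broadcast + group %s deprecated -failover \\\n' "$1" "$2"
    printf 'addif %s netmask + broadcast + failover up\n' "$3"
}

# Inferred usage for the internal-facing group on wb-1:
#   hostname_file wb-1-i-qfe1 wb-1-i wb-1-i       > /etc/hostname.qfe1
#   hostname_file wb-1-i-qfe3 wb-1-i wb-1-i-dummy > /etc/hostname.qfe3
```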
Finally, a reboot of the system will enable IP multipathing. ifconfig
-a can be used to check the status of the interface groups and
their states. Any status changes are also reported via the syslog
mechanism. For example, a link or interface failure, as well as
the failover to correct the problem, will be reported there. Once
IP multipathing is up and running, the system will load balance
outbound connections across each interface in a group. in.mpathd
will monitor the interfaces to determine if there is a failure,
will fail traffic over to other group members on failure, and will
re-enable traffic once the failed interface is again functioning.
This functionality allows Solaris servers to be perfect members
of highly available facilities, such as the one depicted in Figure
1.
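For a quick health check, the output of ifconfig -a can be filtered for interfaces that in.mpathd has marked as failed. The sketch below assumes that a failed interface carries FAILED in its flags list, and that a POSIX awk is available (nawk on Solaris). The function name is hypothetical.

```shell
# failed_interfaces reads `ifconfig -a` output on standard input and
# prints the name of each interface whose flags include FAILED.
failed_interfaces() {
    awk '/flags=/ && /[<,]FAILED[,>]/ { sub(/:$/, "", $1); print $1 }'
}

# Typical use on the Solaris host:
#   ifconfig -a | failed_interfaces
```

An empty result means no interface is currently flagged, which makes the function easy to wire into a monitoring script alongside the syslog messages.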
Further configuration of failover parameters can be done in /etc/default/mpathd.
Here the failure detection and failover time can be changed from its
default of ten seconds. Note that the system sends ICMP ECHO requests
at a rate of ten times the failover time, so decreasing the failover
time will increase the amount of network traffic being sent to detect
a failure.
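The file uses simple NAME=value lines, with the detection time stored as FAILURE_DETECTION_TIME in milliseconds (10000 for the default ten seconds). As a sketch, a hypothetical helper can report the current value before it is changed:

```shell
# detection_time_ms FILE
# Prints the FAILURE_DETECTION_TIME value (milliseconds) from FILE.
# Commented-out lines are ignored because the match is anchored.
detection_time_ms() {
    sed -n 's/^FAILURE_DETECTION_TIME=\([0-9][0-9]*\).*/\1/p' "$1"
}

# Typical use on the Solaris host:
#   detection_time_ms /etc/default/mpathd
```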
An interesting use of IP multipathing can occur on a system with
only one interface per subnet. Here, the same configuration steps
can be performed. Although no resilience is provided, network, interface,
or host bus adapter failures are more accurately detected and reported.
One problem with IP multipathing occurs on the Solaris 8 releases
10/00 and 04/01 when the router discovery daemon (in.rdisc) is in use.
Extraneous ICMP errors occur when an interface in an IP multipathing
group is pinged.
For full details on IP multipathing in general, and the ping problem
in particular, see the very good "IP Network Multipathing"
Sun Blueprints Online document from August 2001, available at:
http://www.sun.com/blueprints/online.html
The best full reference for IP multipathing is in the documentation
set, entitled "IP Network Multipathing Administration Guide"
(available at docs.sun.com).
Thanks to Frank Corrao from Corporate Technologies for contributing
to this column.
Peter Baer Galvin (http://www.petergalvin.org)
is the Chief Technologist for Corporate Technologies (www.cptech.com),
a premier systems integrator and VAR. Before that, Peter was the
systems manager for Brown University's Computer Science Department.
He has written articles for Byte and other magazines, and
previously wrote Pete's Wicked World, the security column,
and Pete's Super Systems, the systems management column for
Unix Insider (http://www.unixinsider.com).
Peter is coauthor of the Operating Systems Concepts and Applied
Operating Systems Concepts textbooks. As a consultant and trainer,
Peter has taught tutorials and given talks on security and systems
administration worldwide.