Providing
Network Services Using LVS
Malcolm Cowe
The Linux Virtual Server (LVS) project provides a scalable server
solution built upon a loosely connected collection of individual
computers organized into a cluster. The primary components are the
director and a collection of "realservers". The director
hosts the interface to the client network, and handles all incoming
requests, delegating connections to one of the realservers in the
cluster. The actual architecture of the cluster remains opaque to
the clients; clients do not need to be aware of the underlying implementation
or topology of the cluster, and they see only a single interface
to the resources provided by the cluster.
The cluster is designed to be highly available, including the
director, which can be installed with a standby node that monitors
for failover conditions. Failover and fault tolerance in the director
will not be covered in this article, although monitoring of the
realservers is addressed.
Background
At Agilent, the client network consists largely of PCs running
Windows NT. However, the manufacturing systems are UNIX workstations
running HP-UX, and a good number of the engineering tools and data
analysis applications employed are also based on the HP-UX platform.
In addition, the majority of the server infrastructure at the site
comprises HP-UX equipment. As a result, the engineers and software
developers require access to UNIX systems in their day-to-day work.
In the past, users were furnished with UNIX workstations at their
desks, often alongside a PC. Unfortunately, workstations are expensive
and the cost of upgrading to newer models, as well as component
costs, prompted consolidation of the client infrastructure to a
single PC on each desk. Users were thus isolated from the UNIX applications
they needed, so X servers were installed on the PC systems, allowing
users to start a UNIX session on a remote machine.
This still left a problem -- what do the users then connect
to? They need consistent and reliable access to an HP-UX workstation
environment. The environment should try to encompass all the traits
of a highly available server, while also providing solutions to
the problem of overloading and congestion. There are several solutions
to this problem, but I decided to deploy a Linux Virtual Server
cluster.
Initial Proposal
Based on past experience and feedback from users, I came up with
the following basic criteria for any centralized X server environment.
The system must:
- Run HP-UX on workstations, because some applications
are not supported on HP servers.
- Have a single point of access. Users are given a single hostname
to which to connect.
- Be robust and reduce single points of failure.
- Perform well under load.
- Be cost-effective (i.e., pretty much free).
Previous solutions involved a pool of workstations nominally earmarked
for use as login servers. Users were given a list of the hosts and
they just picked one. A Web page displayed the number of connections
and average load on each machine, so that users could make an informed
choice as to which system to log into. Unfortunately, users only
referred to the WWW page once, and then never returned. In time,
the load became increasingly imbalanced between the systems --
some workstations would have 20-40 connections while others
had none.
After some research, I came across the Linux Virtual Server project,
hosted at http://www.linuxvirtualserver.org. LVS clusters
comprise a director node that acts as an intelligent load-balancing
switch and a collection of back-end realservers that carry out the
actual work. The whole cluster is presented to the network as a
single entity through a virtual IP address, so users only see one
server. Crucially, only the scheduling node needs to run Linux;
the back-end server pool can be any platform with a TCP/IP stack.
In fact, the pool can be a mixture of platforms, although this is
not necessarily a desirable feature from a support point of view.
Overview of the Agilent Pilot
The plan was to construct a pilot LVS cluster using the simplest
topology: a single network, one-interface cluster using Network
Address Translation (NAT). This means that the director and realservers
in the cluster are kept on the same subnet as the client systems,
and that there is only one network interface on the director (often
the realservers are kept on a private subnet, and the director,
with two interfaces, acts as a bridge between networks). This layout
has the advantage of simplicity. It is straightforward to deploy
this style of LVS cluster on any network, and it requires very little
configuration.
Figure 1 shows the functional topology of the network in which
the pilot LVS cluster was deployed. The cluster is composed of the
director and six realservers. Each realserver is running HP-UX 10.20,
while the director is running Linux.
The director publishes a virtual IP address (VIP), which the client
uses to connect to the cluster. The function of the director is
to forward and balance specified services on this VIP to the individual
realservers. In the pilot setup, it was decided that the cluster
should provide the telnet and XDMCP services, allowing users to
run full XDM sessions over the network, or to invoke an xterm by
initiating a session over telnet (a feature provided by most X server
systems on Windows). All other service connections to the director
not under control of the LVS (e.g., to VIP:ftp) will be delivered
locally on the director.
For this topology to work, all traffic between the clients and the
cluster must pass through the director. The director masquerades all
traffic addressed to the VIP, passing each client request to one of
the realservers to process; when the realserver responds, the reply
packet must be de-masqueraded by the director before being sent back
to the client. Consequently, each realserver in the cluster must route
all of its traffic through the director. In short, all traffic to and
from the LVS cluster must pass through the director for the Network
Address Translation topology to work.
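Under the hood, the director implements this forwarding with ipvsadm
rules. The lvs-director script described later issues these commands
automatically, but a hand-built equivalent for the pilot's two services
might look roughly like the following sketch (the VIP epsg9008 and the
realserver epsg9010 are the example hostnames used in this article, and
the weighted least-connection scheduler is simply my assumption here):
# ipvsadm -A -t epsg9008:23 -s wlc
# ipvsadm -a -t epsg9008:23 -r epsg9010:23 -m
# ipvsadm -A -u epsg9008:177 -s wlc
# ipvsadm -a -u epsg9008:177 -r epsg9010:177 -m
The -A and -a flags add a virtual service and a realserver respectively,
and -m selects masquerading (NAT) as the forwarding method; a pair of
-a lines would be repeated for each realserver in the cluster.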
Creating the Director
At the heart of any LVS cluster is the scheduling node, referred
to as the director. This is the system where most of the installation
effort must be concentrated.
Installing Linux on the Director
Red Hat Linux version 7.1 was used as the basis for the construction
of the Agilent LVS director. There is no dependency within LVS on
any particular distribution and both the 2.2 and 2.4 series kernels
are supported. Bear in mind, however, that there may be small differences
in file system layout between distributions; the examples accompanying
this text follow Red Hat's conventions.
Where possible, it is best to start with a fresh install of the
basic operating system. The director does not need to do anything
other than direct traffic, and so can be pared down to the minimum
required for operation within your network environment. Generally
speaking, choosing the most basic "Server" install from
Red Hat's configuration choices provides a suitable basis for
LVS.
Download Software
In addition to the base Linux distribution, it is necessary to
acquire the following packages:
- A recent release of the Linux kernel source, version 2.4.4
or higher. The examples use version 2.4.9, obtained from http://kernel.org.
- The LVS software, available from:
http://www.linuxvirtualserver.org/software
Get the latest stable release of the IPVS Netfilter module for kernel
2.4. The Agilent pilot uses version 0.8.2. You do not need to download
the patches or any other sources from this page.
- Startup and maintenance scripts. (See Listings 1-3 at
http://www.sysadminmag.com/code/)
Install the Kernel Source
Extract the kernel sources and establish the directory structure
under /usr/src. You must remove or override any soft links
in that directory that refer to linux. The following instructions
should create the correct environment:
# cd /usr/src
# mv linux linux.orig
# bunzip2 -c linux-2.4.9.tar.bz2 | tar xvf -
(alternatively: tar zxvf linux-2.4.9.tar.gz, if the gzip
compressed source tree was acquired)
# mv linux linux-2.4.9
# ln -snf linux-2.4.9 linux
# ln -snf linux-2.4.9 linux-2.4
Extract the LVS Software Distribution and Apply the Kernel Patch
# cd /tmp
# tar zxvf ipvs-0.8.2.tar.gz
# cd /usr/src/linux
# cat /tmp/ipvs-0.8.2/linux_kernel_ksyms_c.diff | patch -p1
patching file kernel/ksyms.c
Hunk #1 succeeded at 264 (offset 11 lines).
Build a New Kernel with IPVS Support
First, make a small change to the kernel configuration. I usually
grab an existing kernel configuration file and execute make oldconfig
before carrying out these steps.
1. Run the menu-based kernel configuration application:
# make menuconfig
2. Select Networking options --->.
3. Select IP: Netfilter Configuration --->.
4. Be sure that the following options are not selected for inclusion
in the kernel, either directly or as modules:
ipchains (2.2-style) support
ipfwadm (2.0-style) support
This step does not have anything to do with LVS itself. However, IPchains
and IPfwadm are legacy firewall modules maintained in the kernel for
backwards compatibility only. Both modules are incompatible with IPtables,
the current kernel firewall implementation, and it is recommended
that users migrate to the current system and disable these modules.
5. Configure the rest of the kernel options as normal, then exit
and save the changes.
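Before building, it is worth confirming that the two compatibility
options really were left out of the saved configuration. Assuming the
standard 2.4 option names, a quick check against the generated .config is:
# grep -E 'CONFIG_IP_NF_COMPAT_(IPCHAINS|IPFWADM)' /usr/src/linux/.config
Both options should be reported as "is not set" (or be absent entirely)
before you proceed.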
Next, build the new kernel and the modules:
# make dep
# make clean
# make bzImage
# make modules
# make modules_install
Install the New Kernel Into the Boot Partition
The following process assumes a Red Hat Linux file system layout.
# cp /usr/src/linux-2.4.9/arch/i386/boot/bzImage /boot/vmlinuz-2.4.9
# cp /usr/src/linux-2.4.9/System.map /boot/System.map-2.4.9
# cd /boot
# ln -snf System.map-2.4.9 System.map
# ln -snf vmlinuz-2.4.9 vmlinuz
# mkinitrd /boot/initrd-2.4.9.img 2.4.9
Edit the Boot Loader Configuration
Red Hat uses lilo in the 7.1 distribution. If you use a different
boot loader, then the process will be different:
1. Edit /etc/lilo.conf.
2. Copy the "image" section for the current Linux kernel.
3. Change the image tag to reflect the new kernel path (e.g.,
change image=/boot/vmlinuz-2.4.2-2 to image=/boot/vmlinuz-2.4.9).
4. Change the label to linux-lvs.
5. If there is an initrd line in the image section, alter
it to reflect the new init ram disk for the new kernel (e.g., initrd=/boot/initrd-2.4.9.img).
The init ram disk is normally only required on systems with a SCSI
boot disk; it pre-loads any drivers needed to access the boot hardware
during the boot process.
6. Near the top of the file, change the "default" variable
to linux-lvs (e.g., default=linux-lvs).
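Pieced together, the new image section might look something like the
following sketch; the root device (and any append options) will differ
on your system, so copy those lines from the existing entry rather than
from here:
image=/boot/vmlinuz-2.4.9
        label=linux-lvs
        initrd=/boot/initrd-2.4.9.img
        read-only
        root=/dev/sda1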
Write your changes and exit the editor. Run the lilo command
to activate the changes, then reboot the system:
# /sbin/lilo
Added linux
Added linux-lvs *
# reboot
Install the IPVS Modules and Tools
Change to the directory where the LVS IP Virtual Service software
was extracted, and build the software. Building the IPVS kernel
modules will generate a number of compiler warnings. This appears
to be normal, so ignore them for now.
To build and install IPVS on the LVS Director:
# cd /tmp/ipvs-0.8.2/ipvs
# make
# make -C ipvsadm
# make -C ipvsadm install
# mkdir -p /lib/modules/2.4.9/kernel/net/lvs
# cp *.o /lib/modules/2.4.9/kernel/net/lvs
# depmod -a
depmod will report two errors when it is run; it is safe to ignore
them. Two of the files copied across are not kernel modules and do
not actually need to be copied, but they do no harm, and it is easier
just to copy the lot. The errors are simply a side effect of copying
those extra files and are not a shortcoming in the LVS code.
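As a quick sanity check, the core module can be loaded by hand and the
(still empty) virtual server table listed. The module name ip_vs follows
the IPVS 0.8.x releases, so adjust it if your version differs:
# modprobe ip_vs
# ipvsadm -L -n
If the second command prints an empty table rather than an error, the
kernel module and the ipvsadm tool are installed consistently.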
LVS Director Configuration
Having established the basis upon which to build virtual services,
it is now time to configure the LVS environment, beginning with
the director. A shell script has been developed in conjunction with
this documentation to ease the administrative overhead of preparing
the director for use within a single interface, NAT cluster topology.
While not as comprehensive in scope as the configure script from
the LVS project, it aims to be easier to install and maintain by
making certain assumptions on behalf of the administrator.
Install the LVS Director Startup Script
Extract the LVS startup and maintenance scripts, and then install
the director control script (Listing 1) and its configuration file
(Listing 2):
# cd /tmp
# tar xvf lvs-control.tar
# cp lvs.conf /etc
# cp lvs-director.sh /etc/rc.d/init.d/lvs-director
# chmod 755 /etc/rc.d/init.d/lvs-director
# cd /etc/rc.d/rc3.d
# ln -s ../init.d/lvs-director S99lvs-director
Edit the Configuration
Edit /etc/lvs.conf to reflect your environment. The file
contains a good deal of information to guide you through this process.
A summary of the options follows:
VIP -- Virtual IP address of the LVS service. This
is the host name or IP address that clients use to connect to the
service (e.g., VIP=epsg9008).
VIP_IF -- Virtual IP network interface. The VIP must
be associated with a network interface to be available to the network.
When there is only one physical NIC, create an alias for this network
interface. Aliases are represented using the syntax eth<x>:<number>.
For example, eth0:110 would be a valid alias for the Ethernet
interface eth0 (i.e., VIP_IF=eth0:110).
LVS_IF -- This is the primary network interface used
by the LVS director. Since the setup currently only supports one
network (one interface NAT topology), there should be little requirement
to change the default value of eth0. It is only used to turn
off ICMP redirects on the interfaces that LVS uses (i.e., LVS_IF="eth0").
TPORTS -- A list of the TCP-based services that the
LVS provides to the network. Each service is delimited by white
space, and may be a port number or a "human-readable"
label corresponding to an entry in the services file (look in /etc/services
for a list). Do not forget to wrap the list in quotes (e.g., TPORTS="telnet").
UPORTS -- A list of UDP-based services provided by
the LVS. The same rules apply to this list as for TPORTS (i.e.,
UPORTS="xdmcp").
RIPS -- Realserver IP addresses. This is the list of
realservers that make up the back end of the LVS cluster. The director
node determines which realserver will respond to a client request,
while the realserver carries out the actual tasks. Again, the list
is whitespace delimited. For example:
RIPS="epsg9010 \
epsg9011 \
epsg9012 \
epsg9013 \
epsg9014 \
epsg9015 \
"
How to Use the Startup Script
The director startup script, lvs-director, has a number of command-line
options to help with the running and configuration of the cluster.
This script can be used for nearly all administrative tasks related
to the cluster itself, and is not simply restricted to being a startup/shutdown
script for the system's boot process. The script takes all
its configuration information from the file /etc/lvs.conf.
lvs-director currently supports the following command-line arguments:
start -- This flag starts the LVS service if it is
not already running. First, the existing virtual service table is
initialized to make way for the new configuration. Second, the network
interface for the cluster's Virtual IP address is established.
Third, the LVS services are added to the director.
stop -- This flag shuts down the LVS services and unloads
the network interface of the Virtual IP address.
reload -- The reload command will apply any
changes made to the cluster configuration in /etc/lvs.conf,
which is done by clearing the LVS table and reloading it with the
information in the configuration file. The network interface for
the virtual service is not affected by this command.
status -- Returns the current status of the LVS cluster
as output by ipvsadm -L.
update -- This is really a polling command. Each realserver
defined in /etc/lvs.conf is polled to see whether it is still
available to the network. If the machine does not respond to a ping
request, it is removed from the LVS table. Conversely, if the machine
does respond to the request and it was previously removed from the
cluster (or it is a new node), the machine is added into the cluster
configuration. By adding this command into cron, it is possible
to establish a monitoring service for each realserver in the cluster.
Of course, it does not monitor the services themselves, only basic
network connectivity. Nevertheless, it is a reasonable mechanism
for automating the cluster configuration based on server availability.
In this way, if a realserver crashes or otherwise becomes unavailable,
it can be removed from the cluster automatically until it is repaired
and returned to service, reducing the risk of a client being directed
to a failed realserver.
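In practice, day-to-day operation boils down to a handful of invocations:
# /etc/rc.d/init.d/lvs-director start
# /etc/rc.d/init.d/lvs-director status
# /etc/rc.d/init.d/lvs-director reload
A root crontab entry on the director keeps the realserver list in step
with availability; the five-minute interval here is only a suggestion:
*/5 * * * * /etc/rc.d/init.d/lvs-director update >/dev/null 2>&1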
How to Set Up the Realservers
By comparison to the setup of the director node in the cluster,
the realservers are relatively easy to configure. However, you must
be comfortable with the routine network administration tasks on the
realserver platforms to complete this part of the cluster configuration.
The following sections describe how to complete the LVS cluster
installation process, with an aside on the provision of XDMCP services
to the client network.
Realserver Routing Setup
To ensure that the routing tables are correctly set up for the
cluster, each realserver must be configured to use the LVS director
as the default route. This is normally carried out at OS installation
time, but there are tools available on most OS releases that allow
the administrator to make the changes later. For example, on Red
Hat a tool called setup can be invoked, and on HP-UX the
network configuration is handled by set_parms. If you prefer
to edit the configuration files directly, Red Hat stores this information
in /etc/sysconfig, in the file network and the sub-directory
network-scripts, while HP-UX keeps this information in /etc/rc.config.d/netconf.
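For reference, the setting in question is simply a default route
pointing at the director. Assuming the director answers on an address
reachable from the realserver subnet (shown here as the placeholder
<director-IP>), the entries would look something like this on Red Hat,
in /etc/sysconfig/network:
GATEWAY=<director-IP>
and on HP-UX, in /etc/rc.config.d/netconf:
ROUTE_DESTINATION[0]="default"
ROUTE_GATEWAY[0]="<director-IP>"
ROUTE_COUNT[0]="1"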
It is also essential that no other network routes be configured
for the LVS Ethernet interface on the realserver. To do this, download
the script (Listing 3) to the realserver.
LVS Route Alteration
This script removes the subnet route from the network interface,
leaving only the default route. This ensures that the cluster uses
the LVS Director for all network traffic and does not try to respond
to client requests using a direct route, which is a prerequisite
of a Network Address Translation setup. (See the LVS HOWTO for a
detailed description of the routing issues with LVS-NAT). Install
the lvs routing script on each realserver in the LVS cluster. Do
not install this script on the director. Currently supported platforms
are HP-UX and Linux, although this is readily extensible to incorporate
other systems.
On HP-UX:
# cp lvs-route /sbin/init.d
# chmod 755 /sbin/init.d/lvs-route
# cd /sbin/rc3.d
# ln -sf /sbin/init.d/lvs-route S200lvs-route
On Linux:
# cp lvs-route /etc/rc.d/init.d
# chmod 755 /etc/rc.d/init.d/lvs-route
# cd /etc/rc.d/rc3.d
# ln -snf ../init.d/lvs-route S99lvs-route
# cd /etc/rc.d/rc5.d
# ln -snf ../init.d/lvs-route S99lvs-route
It may be necessary to alter the NET_INTERFACE parameter in
this script to point to the correct Ethernet interface. In most cases,
"0" is the correct default (corresponding to "eth0"
or "lan0"), but it may need to be changed. To activate the
routing changes on the realserver, execute the script or reboot the
system.
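The operation lvs-route performs is essentially a single route deletion
per interface. On Linux, for example, it amounts to something like the
following (the subnet and netmask here are placeholders; Listing 3 works
them out from the interface configuration):
# route del -net 192.168.10.0 netmask 255.255.255.0 dev eth0
# netstat -rn
After running it, netstat -rn should show only the default route through
the director for that interface.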
Special Note Concerning X over LVS
When using LVS to load balance XDMCP sessions (e.g., Reflection
X clients), it is necessary to ensure that each realserver is running
a login service capable of handling the requests. On HP-UX, this
means running the CDE desktop login on each realserver, and on Linux
this means running XDM or one of its derivatives (e.g., GDM or KDM).
On Red Hat Linux, setting the default run-level to 5 (or GUI Login)
will launch the Display Manager at system startup.
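On Red Hat, the run-level is set in /etc/inittab; the relevant line
simply needs to read:
id:5:initdefault:
followed by a reboot (or telinit 5) for the Display Manager to start.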
In terms of installation, therefore, the LVS realservers should
be treated as though they are standard desktop workstations to ensure
that the correct software and services are installed on the system.
On Linux, the X11 system must be installed and configured in order
to run the Display Manager. On Red Hat this means choosing either
a Workstation install or a custom install, which will allow more
fine-tuning.
Malcolm Cowe is currently employed by Sun Microsystems as a
Senior Test Engineer. Prior to that he was a systems administrator,
working at Agilent Technologies in the Technical Computing Team.