Providing
Network Services Using LVS
Malcolm Cowe
The Linux Virtual Server (LVS) project provides a scalable server
solution built upon a loosely connected collection of individual
computers organized into a cluster. The primary components are the
director and a collection of "realservers". The director
hosts the interface to the client network, and handles all incoming
requests, delegating connections to one of the realservers in the
cluster. The actual architecture of the cluster remains opaque to
the clients; clients do not need to be aware of the underlying implementation
or topology of the cluster, and they see only a single interface
to the resources provided by the cluster.
The cluster is designed to be highly available, including the
director, which can be installed with a standby node that monitors
for failover conditions. Failover and fault tolerance in the director
will not be covered in this article, although monitoring of the
realservers is addressed.
Background
At Agilent, the client network consists largely of PCs running
Windows NT. However, the manufacturing systems are UNIX workstations
running HP-UX, and a good number of the engineering tools and data
analysis applications employed are also based on the HP-UX platform.
In addition, the majority of the server infrastructure at the site
comprises HP-UX equipment. As a result, the engineers and software
developers require access to UNIX systems in their day-to-day work.
In the past, users were furnished with UNIX workstations at their
desks, often alongside a PC. Unfortunately, workstations are expensive
and the cost of upgrading to newer models, as well as component
costs, prompted consolidation of the client infrastructure to a
single PC on each desk. Users were thus isolated from the UNIX applications
they needed, so X servers were installed on the PC systems, allowing
users to start a UNIX session on a remote machine.
This still left a problem -- what do the users then connect
to? They need consistent and reliable access to an HP-UX workstation
environment. The environment should try to encompass all the traits
of a highly available server, while also providing solutions to
the problem of overloading and congestion. There are several solutions
to this problem, but I decided to deploy a Linux Virtual Server
cluster.
Initial Proposal
Based on past experience and feedback from users, I came up with
the following basic criteria for any centralized X server environment.
The system must:
- Run HP-UX on workstations, because some applications
are not supported on HP servers.
- Have a single point of access. Users are given a single hostname
to which to connect.
- Be robust and reduce single points of failure.
- Perform well under load.
- Be cost-effective (i.e., pretty much free).
Previous solutions involved a pool of workstations nominally earmarked
for use as login servers. Users were given a list of the hosts and
they just picked one. A Web page displayed the number of connections
and average load on each machine, so that users could make an informed
choice as to which system to log into. Unfortunately, users only
referred to the WWW page once, and then never returned. In time,
the load became increasingly imbalanced between the systems --
some workstations would have 20-40 connections while others
had none.
After some research, I came across the Linux Virtual Server project,
hosted at http://www.linuxvirtualserver.org. LVS clusters
comprise a director node that acts as an intelligent load-balancing
switch and a collection of back-end realservers that carry out the
actual work. The whole cluster is presented to the network as a
single entity through a virtual IP address, so users only see one
server. Crucially, only the scheduling node needs to run Linux;
the back-end server pool can be any platform with a TCP/IP stack.
In fact, the pool can be a mixture of platforms, although this is
not necessarily a desirable feature from a support point of view.
Overview of the Agilent Pilot
The plan was to construct a pilot LVS cluster using the simplest
topology: a single network, one-interface cluster using Network
Address Translation (NAT). This means that the director and realservers
in the cluster are kept on the same subnet as the client systems,
and that there is only one network interface on the director (often
the realservers are kept on a private subnet, and the director,
with two interfaces, acts as a bridge between networks). This layout
has the advantage of simplicity. It is straightforward to deploy
this style of LVS cluster on any network, and it requires very little
configuration.
Figure 1 shows the functional topology of the network in which
the pilot LVS cluster was deployed. The cluster is composed of the
director and six realservers. Each realserver is running HP-UX 10.20,
while the director is running Linux.
The director publishes a virtual IP address (VIP), which the client
uses to connect to the cluster. The function of the director is
to forward and balance specified services on this VIP to the individual
realservers. In the pilot setup, it was decided that the cluster
should provide the telnet and XDMCP services, allowing users to
run full XDM sessions over the network, or to invoke an xterm by
initiating a session over telnet (a feature provided by most X server
systems on Windows). All other service connections to the director
not under control of the LVS (e.g., to VIP:ftp) will be delivered
locally on the director.
For this topology to work, all traffic between the clients and the
cluster must pass through the director. The director masquerades all
traffic addressed to the VIP, passing each client request to one of
the realservers to process; when the realserver responds, the reply
packet must be de-masqueraded by the director before being sent back
to the client. Consequently, each realserver in the cluster must route
all of its traffic through the director. In short, all traffic to and
from the LVS cluster must pass through the director for the Network
Address Translation topology to work.
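Under the hood, the director implements this forwarding with ipvsadm
rules. The lvs-director script described later issues these commands
automatically, but a hand-built equivalent for the pilot's two services
might look roughly like the following sketch (the VIP epsg9008 and the
realserver epsg9010 are the example hostnames used in this article, and
the weighted least-connection scheduler is simply my assumption here):
# ipvsadm -A -t epsg9008:23 -s wlc
# ipvsadm -a -t epsg9008:23 -r epsg9010:23 -m
# ipvsadm -A -u epsg9008:177 -s wlc
# ipvsadm -a -u epsg9008:177 -r epsg9010:177 -m
The -A and -a flags add a virtual service and a realserver respectively,
and -m selects masquerading (NAT) as the forwarding method; a pair of
-a lines would be repeated for each realserver in the cluster.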
Creating the Director
At the heart of any LVS cluster is the scheduling node, referred
to as the director. This is the system where most of the installation
effort must be concentrated.
Installing Linux on the Director
Red Hat Linux version 7.1 was used as the basis for the construction
of the Agilent LVS director. There is no dependency within LVS on
any particular distribution and both the 2.2 and 2.4 series kernels
are supported. Bear in mind, however, that there may be small differences
in file system layout between distributions; the examples accompanying
this text follow Red Hat's conventions.
Where possible, it is best to start with a fresh install of the
basic operating system. The director does not need to do anything
other than direct traffic, and so can be pared down to the minimum
required for operation within your network environment. Generally
speaking, choosing the most basic "Server" install from
Red Hat's configuration choices provides a suitable basis for
LVS.
Download Software
In addition to the base Linux distribution, it is necessary to
acquire the following packages:
- A recent release of the Linux kernel source, version 2.4.4
or higher. The examples use version 2.4.9, obtained from http://kernel.org.
- The LVS software, available from:
http://www.linuxvirtualserver.org/software
Get the latest stable release of the IPVS Netfilter module for kernel
2.4. The Agilent pilot uses version 0.8.2. You do not need to download
the patches or any other sources from this page.
- Startup and maintenance scripts. (See Listings 1-3 at
http://www.sysadminmag.com/code/)
Install the Kernel Source
Extract the kernel sources and establish the directory structure
under /usr/src. You must remove or override any soft links
in that directory that refer to linux. The following instructions
should create the correct environment:
# cd /usr/src
# mv linux linux.orig
# bunzip2 -c linux-2.4.9.tar.bz2 | tar xvf -
(alternatively: tar zxvf linux-2.4.9.tar.gz, if the gzip
compressed source tree was acquired)
# mv linux linux-2.4.9
# ln -snf linux-2.4.9 linux
# ln -snf linux-2.4.9 linux-2.4
Extract the LVS Software Distribution and Apply the Kernel Patch
# cd /tmp
# tar zxvf ipvs-0.8.2.tar.gz
# cd /usr/src/linux
# cat /tmp/ipvs-0.8.2/linux_kernel_ksyms_c.diff | patch -p1
patching file kernel/ksyms.c
Hunk #1 succeeded at 264 (offset 11 lines).
Build a New Kernel with IPVS Support
First, make a small change to the kernel configuration. I usually
grab an existing kernel configuration file and execute make oldconfig
before carrying out these steps.
1. Run the menu-based kernel configuration application:
# make menuconfig
2. Select Networking options --->.
3. Select IP: Netfilter Configuration --->.
4. Be sure that the following options are not selected for inclusion
in the kernel, either directly or as modules:
ipchains (2.2-style) support
ipfwadm (2.0-style) support
This step does not have anything to do with LVS itself. However, IPchains
and IPfwadm are legacy firewall modules maintained in the kernel for
backwards compatibility only. Both modules are incompatible with IPtables,
the current kernel firewall implementation, and it is recommended
that users migrate to the current system and disable these modules.
5. Configure the rest of the kernel options as normal, then exit
and save the changes.
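Before building, it is worth confirming that the two compatibility
options really were left out of the saved configuration. Assuming the
standard 2.4 option names, a quick check against the generated .config is:
# grep -E 'CONFIG_IP_NF_COMPAT_(IPCHAINS|IPFWADM)' /usr/src/linux/.config
Both options should be reported as "is not set" (or be absent entirely)
before you proceed.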
Next, build the new kernel and the modules:
# make dep
# make clean
# make bzImage
# make modules
# make modules_install
Install the New Kernel Into the Boot Partition
The following process assumes a Red Hat Linux file system layout.
# cp /usr/src/linux-2.4.9/arch/i386/boot/bzImage /boot/vmlinuz-2.4.9
# cp /usr/src/linux-2.4.9/System.map /boot/System.map-2.4.9
# cd /boot
# ln -snf System.map-2.4.9 System.map
# ln -snf vmlinuz-2.4.9 vmlinuz
# mkinitrd /boot/initrd-2.4.9.img 2.4.9
Edit the Boot Loader Configuration
Red Hat uses lilo in the 7.1 distribution. If you use a different
boot loader, then the process will be different:
1. Edit /etc/lilo.conf.
2. Copy the "image" section for the current Linux kernel.
3. Change the image tag to reflect the new kernel path (e.g.,
change image=/boot/vmlinuz-2.4.2-2 to image=/boot/vmlinuz-2.4.9).
4. Change the label to linux-lvs.
5. If there is an initrd line in the image section, alter
it to reflect the new init ram disk for the new kernel (e.g., initrd=/boot/initrd-2.4.9.img).
The init ram disk is normally only required on systems with a SCSI
boot disk; it pre-loads any drivers needed to access the boot hardware
during the boot process.
6. Near the top of the file, change the "default" variable
to linux-lvs (e.g., default=linux-lvs).
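Pieced together, the new image section might look something like the
following sketch; the root device (and any append options) will differ
on your system, so copy those lines from the existing entry rather than
from here:
image=/boot/vmlinuz-2.4.9
        label=linux-lvs
        initrd=/boot/initrd-2.4.9.img
        read-only
        root=/dev/sda1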
Write your changes and exit the editor. Run the lilo command
to activate the changes, then reboot the system:
# /sbin/lilo
Added linux
Added linux-lvs *
# reboot
Install the IPVS Modules and Tools
Change to the directory where the LVS IP Virtual Service software
was extracted, and build the software. Building the IPVS kernel
modules will generate a number of compiler warnings. This appears
to be normal, so ignore them for now.
To build and install IPVS on the LVS Director:
# cd /tmp/ipvs-0.8.2/ipvs
# make
# make -C ipvsadm
# make -C ipvsadm install
# mkdir -p /lib/modules/2.4.9/kernel/net/lvs
# cp *.o /lib/modules/2.4.9/kernel/net/lvs
# depmod -a
depmod will report two errors when it is run; it is safe to ignore
them. Two of the files copied across are not kernel modules and do
not actually need to be copied, but they do no harm, and it is easier
just to copy the lot. The errors are simply a side effect of copying
those extra files and are not a shortcoming in the LVS code.
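As a quick sanity check, the core module can be loaded by hand and the
(still empty) virtual server table listed. The module name ip_vs follows
the IPVS 0.8.x releases, so adjust it if your version differs:
# modprobe ip_vs
# ipvsadm -L -n
If the second command prints an empty table rather than an error, the
kernel module and the ipvsadm tool are installed consistently.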
LVS Director Configuration
Having established the basis upon which to build virtual services,
it is now time to configure the LVS environment, beginning with
the director. A shell script has been developed in conjunction with
this documentation to ease the administrative overhead of preparing
the director for use within a single interface, NAT cluster topology.
While not as comprehensive in scope as the configure script from
the LVS project, it aims to be easier to install and maintain by
making certain assumptions on behalf of the administrator.
Install the LVS Director Startup Script
Extract the LVS startup and maintenance scripts, and then install
the director control script (Listing 1) and its configuration file
(Listing 2):
# cd /tmp
# tar xvf lvs-control.tar
# cp lvs.conf /etc
# cp lvs-director.sh /etc/rc.d/init.d/lvs-director
# chmod 755 /etc/rc.d/init.d/lvs-director
# cd /etc/rc.d/rc3.d
# ln -s ../init.d/lvs-director S99lvs-director
Edit the Configuration
Edit /etc/lvs.conf to reflect your environment. The file
contains a good deal of information to guide you through this process.
A summary of the options follows:
VIP -- Virtual IP address of the LVS service. This
is the host name or IP address that clients use to connect to the
service (e.g., VIP=epsg9008).
VIP_IF -- Virtual IP network interface. The VIP must
be associated with a network interface to be available to the network.
When there is only one physical NIC, create an alias for this network
interface. Aliases are represented using the syntax eth<x>:<number>.
For example, eth0:110 would be a valid alias for the Ethernet
interface eth0 (i.e., VIP_IF=eth0:110).
LVS_IF -- This is the primary network interface used
by the LVS director. Since the setup currently only supports one
network (one interface NAT topology), there should be little requirement
to change the default value of eth0. It is only used to turn
off ICMP redirects on the interfaces that LVS uses (i.e., LVS_IF="eth0").
TPORTS -- A list of the TCP-based services that the
LVS provides to the network. Each service is delimited by white
space, and may be a port number or a "human-readable"
label corresponding to an entry in the services file (look in /etc/services
for a list). Do not forget to wrap the list in quotes (e.g., TPORTS="telnet").
UPORTS -- A list of UDP-based services provided by
the LVS. The same rules apply to this list as for TPORTS (i.e.,
UPORTS="xdmcp").
RIPS -- Realserver IP addresses. This is the list of
realservers that make up the back end of the LVS cluster. The director
node determines which realserver will respond to a client request,
while the realserver carries out the actual tasks. Again, the list
is whitespace delimited. For example:
RIPS="epsg9010 \
epsg9011 \
epsg9012 \
epsg9013 \
epsg9014 \
epsg9015 \
"
How to Use the Startup Script
The director startup script, lvs-director, has a number of command-line
options to help with the running and configuration of the cluster.
This script can be used for nearly all administrative tasks related
to the cluster itself, and is not simply restricted to being a startup/shutdown
script for the system's boot process. The script takes all
its configuration information from the file /etc/lvs.conf.
lvs-director currently supports the following command-line arguments:
start -- This flag starts the LVS service if it is
not already running. First, the existing virtual service table is
initialized to make way for the new configuration. Second, the network
interface for the cluster's Virtual IP address is established.
Third, the LVS services are added to the director.
stop -- This flag shuts down the LVS services and unloads
the network interface of the Virtual IP address.
reload -- The reload command will apply any
changes made to the cluster configuration in /etc/lvs.conf,
which is done by clearing the LVS table and reloading it with the
information in the configuration file. The network interface for
the virtual service is not affected by this command.
status -- Returns the current status of the LVS cluster
as output by ipvsadm -L.
update -- This is really a polling command. Each realserver
defined in /etc/lvs.conf is polled to see whether it is still
available to the network. If the machine does not respond to a ping
request, it is removed from the LVS table. Conversely, if the machine
does respond to the request and it was previously removed from the
cluster (or it is a new node), the machine is added into the cluster
configuration. By adding this command into cron, it is possible
to establish a monitoring service for each realserver in the cluster.
Of course, it does not monitor the services themselves, only basic
network connectivity. Nevertheless, it is a reasonable mechanism
for automating the cluster configuration based on server availability.
In this way, if a realserver crashes or otherwise becomes unavailable,
it can be removed from the cluster automatically until it is repaired
and returned to service, reducing the risk of a client being directed
to a failed realserver.
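In practice, day-to-day operation boils down to a handful of invocations:
# /etc/rc.d/init.d/lvs-director start
# /etc/rc.d/init.d/lvs-director status
# /etc/rc.d/init.d/lvs-director reload
A root crontab entry on the director keeps the realserver list in step
with availability; the five-minute interval here is only a suggestion:
*/5 * * * * /etc/rc.d/init.d/lvs-director update >/dev/null 2>&1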
How to Set Up the Realservers
By comparison to the setup of the director node in the cluster,
the realservers are relatively easy to configure. However, you must
be comfortable with the routine network administration tasks on the
realserver platforms to complete this part of the cluster configuration.
The following sections describe how to complete the LVS cluster
installation process, with an aside on the provision of XDMCP services
to the client network.
Realserver Routing Setup
To ensure that the routing tables are correctly set up for the
cluster, each realserver must be configured to use the LVS director
as the default route. This is normally carried out at OS installation
time, but there are tools available on most OS releases that allow
the administrator to make the changes later. For example, on Red
Hat a tool called setup can be invoked, and on HP-UX the
network configuration is handled by set_parms. If you prefer
to edit the configuration files directly, Red Hat stores this information
in /etc/sysconfig, in the file network and the sub-directory
network-scripts, while HP-UX keeps this information in /etc/rc.config.d/netconf.
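For reference, the setting in question is simply a default route
pointing at the director. Assuming the director answers on an address
reachable from the realserver subnet (shown here as the placeholder
<director-IP>), the entries would look something like this on Red Hat,
in /etc/sysconfig/network:
GATEWAY=<director-IP>
and on HP-UX, in /etc/rc.config.d/netconf:
ROUTE_DESTINATION[0]="default"
ROUTE_GATEWAY[0]="<director-IP>"
ROUTE_COUNT[0]="1"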
It is also essential that no other network routes be configured
for the LVS Ethernet interface on the realserver. To do this, download
the script (Listing 3) to the realserver.
LVS Route Alteration
This script removes the subnet route from the network interface,
leaving only the default route. This ensures that the cluster uses
the LVS Director for all network traffic and does not try to respond
to client requests using a direct route, which is a prerequisite
of a Network Address Translation setup. (See the LVS HOWTO for a
detailed description of the routing issues with LVS-NAT). Install
the lvs routing script on each realserver in the LVS cluster. Do
not install this script on the director. Currently supported platforms
are HP-UX and Linux, although this is readily extensible to incorporate
other systems.
On HP-UX:
# cp lvs-route /sbin/init.d
# chmod 755 /sbin/init.d/lvs-route
# cd /sbin/rc3.d
# ln -sf /sbin/init.d/lvs-route S200lvs-route
On Linux:
# cp lvs-route /etc/rc.d/init.d
# chmod 755 /etc/rc.d/init.d/lvs-route
# cd /etc/rc.d/rc3.d
# ln -snf ../init.d/lvs-route S99lvs-route
# cd /etc/rc.d/rc5.d
# ln -snf ../init.d/lvs-route S99lvs-route
It may be necessary to alter the NET_INTERFACE parameter in
this script to point to the correct Ethernet interface. In most cases,
"0" is the correct default (corresponding to "eth0"
or "lan0"), but it may need to be changed. To activate the
routing changes on the realserver, execute the script or reboot the
system.
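The operation lvs-route performs is essentially a single route deletion
per interface. On Linux, for example, it amounts to something like the
following (the subnet and netmask here are placeholders; Listing 3 works
them out from the interface configuration):
# route del -net 192.168.10.0 netmask 255.255.255.0 dev eth0
# netstat -rn
After running it, netstat -rn should show only the default route through
the director for that interface.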
Special Note Concerning X over LVS
When using LVS to load balance XDMCP sessions (e.g., Reflection
X clients), it is necessary to ensure that each realserver is running
a login service capable of handling the requests. On HP-UX, this
means running the CDE desktop login on each realserver, and on Linux
this means running XDM or one of its derivatives (e.g., GDM or KDM).
On Red Hat Linux, setting the default run-level to 5 (or GUI Login)
will launch the Display Manager at system startup.
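On Red Hat, the run-level is set in /etc/inittab; the relevant line
simply needs to read:
id:5:initdefault:
followed by a reboot (or telinit 5) for the Display Manager to start.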
In terms of installation, therefore, the LVS realservers should
be treated as though they are standard desktop workstations to ensure
that the correct software and services are installed on the system.
On Linux, the X11 system must be installed and configured in order
to run the Display Manager. On Red Hat this means choosing either
a Workstation install or a custom install, which will allow more
fine-tuning.
Malcolm Cowe is currently employed by Sun Microsystems as a
Senior Test Engineer. Prior to that he was a systems administrator,
working at Agilent Technologies in the Technical Computing Team.