High-Availability File Server with heartbeat
Steve Blackmon and John Nguyen
Maintaining maximum system uptime is becoming increasingly critical
to the success of any organization. While there are many off-the-shelf
solutions for high availability, they are often very expensive and
require expertise that smaller companies do not have on staff. In
this article, we present a much lower cost alternative to achieving
high-availability (HA) services using inexpensive hardware and publicly
available software. A systems administrator can learn to use and
maintain our system with minimal time investment. We will provide
step-by-step procedures for building a high-availability file server
for UNIX and Windows clients. Although the article focuses on how
to set up a file server, the technique could be applied to any number
of services.
Hardware and Software Components
Hardware
To get started, you will need two systems with at least one network
interface each (preferably two), an available serial port, a SCSI
controller, and an external SCSI hard drive.
We used two identically configured Intel ISP 1100 servers with
650-MHz Pentium III processors and 128 MB of RAM. These systems
each have two integrated 10/100 Ethernet interfaces and are rack-mountable
(1U). Each system has two internal IDE drives, which we used for
the OS installation. For our shared disk, we used an external 9-GB
SCSI drive that is attached to both systems. Our SCSI controllers
are Adaptec AHA-2940AUs (see Figure 1).
Software
We used Red Hat Linux 6.2 (kernel 2.2.14-5.0), Samba version 2.0.6-9
(included with RH 6.2), and heartbeat version 0.4.9-1 (available
from: http://www.linux-ha.org/download).
heartbeat is a publicly available package written by Alan Robertson.
heartbeat provides the basic functions required by any HA system
such as starting and stopping resources, monitoring the availability
of the systems in the cluster, and transferring ownership of a shared
IP address between nodes in the cluster. heartbeat is a software
solution that monitors the health of a particular service (or services)
through either a serial line or Ethernet interface or both. The
current version supports a 2-node configuration whereby special
heartbeat "pings" (broadcast/multicast messages) are used
to check the status and availability of a service. It is a vital
component of the whole Linux-HA package.
Although heartbeat is currently only available for Linux, the
next release will include support for Solaris, FreeBSD, and OpenBSD.
We are grateful to Alan for contributing this useful software
and also for his input while we were writing this article.
Procedure
This is the procedure we used with our hardware. You may need
to adapt it to your situation and the hardware available to you.
Hook Up the Equipment
Figure 1 shows the connectivity required for this cluster.
- Connect a null modem cable between the serial ports on each
system.
- Connect a Cat 5 crossover cable between the second Ethernet
interfaces on each system.
- Connect a SCSI cable from each system to your external SCSI
disk.
Change the SCSI ID on Your Primary System
With SCSI, every component on the bus must have a unique ID, including
the host adapter cards that normally have a default ID of 7. Because
our SCSI bus will have two host adapters and a disk, we need to
change the ID of one of the adapters. We changed the SCSI ID of
our primary system to 6 and left the ID of the secondary system
at 7. The ID of the adapter must be changed from the SCSI BIOS.
With Adaptec SCSI controllers, you typically get into the BIOS configuration
screen by pressing <control>A when prompted during the boot
process. If you are using some other host adapter, you will need
to refer to your manual to figure out how to change the SCSI ID
of your adapter.
Install the Operating System on the Primary System
We called our systems "ttisrv1" and "ttisrv2";
ttisrv1 is our primary system and ttisrv2 is the secondary. You
will want to give the primary Ethernet interface of each system
a unique public address. You also need to configure your secondary
Ethernet interfaces with IP addresses, but you can pick any unique
subnet because these interfaces will be private. If you do a custom
installation, don't forget to include Samba.
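As a rough illustration of the private interface setup, the following is a sketch of what the interface file for eth1 might look like on ttisrv1. The file location is the usual Red Hat one, but the 10.0.0.0/24 subnet and both addresses are assumptions chosen for this example; they are not the values from our cluster.

```shell
# /etc/sysconfig/network-scripts/ifcfg-eth1 on ttisrv1 (hypothetical values)
# 10.0.0.0/24 is an assumed private subnet used only on the crossover link.
# On ttisrv2 the same file would carry a different address, e.g. 10.0.0.2.
DEVICE=eth1
BOOTPROTO=static
IPADDR=10.0.0.1
NETMASK=255.255.255.0
ONBOOT=yes
```

Any subnet works as long as it does not overlap with your public network and both nodes use addresses from it.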
Set Up the External Disk
You will need to partition and create a filesystem on your external
disk. Note that this is only necessary on your primary system. We
used a single partition that contained the entire disk.
Create partition with fdisk:
ttisrv1# fdisk /dev/sda
The number of cylinders for this disk is set to 1116. There is nothing
wrong with that, but this is larger than 1024, and could in certain
setups cause problems with:
1. Software that runs at boot time (e.g., LILO)
2. Booting and partitioning software from other OSs
(e.g., DOS FDISK, OS/2 FDISK)
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-1116, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-1116, default 1116):
Using default value 1116
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
WARNING: If you have created or modified any DOS 6.x
partitions, please see the fdisk manual page for additional
information.
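Now we create a filesystem on the external SCSI disk. We name the ext2 type explicitly because it is the type given for this partition in the haresources file:
ttisrv1# mkfs -t ext2 /dev/sda1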
Create a mount point for the disk:
ttisrv1# mkdir /ttidisk
Make sure you can mount the filesystem:
ttisrv1# mount /dev/sda1 /ttidisk
Create a directory to hold the Samba password file:
ttisrv1# mkdir /ttidisk/smb
Create a public directory for file sharing and set the proper permissions:
ttisrv1# mkdir /ttidisk/public
ttisrv1# chmod 1775 /ttidisk/public
Unmount the drive (heartbeat will mount it for you):
ttisrv1# umount /ttidisk
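Because the shared disk must never be left mounted by hand once heartbeat takes over, it can help to confirm the unmount. The helper below is a minimal sketch, not part of heartbeat; the function name is_mounted is our own invention, and it assumes the /proc/mounts layout in which the second field is the mount point.

```shell
#!/bin/sh
# is_mounted MOUNTPOINT [MOUNTS_FILE]
# Succeeds if MOUNTPOINT appears as a mount point in MOUNTS_FILE.
# MOUNTS_FILE defaults to /proc/mounts (field 2 is the mount point).
is_mounted() {
    mp=$1
    mounts=${2:-/proc/mounts}
    awk -v mp="$mp" '$2 == mp { found = 1 } END { exit !found }' "$mounts"
}

# Example (not run here):
#   is_mounted /ttidisk && echo "/ttidisk is still mounted"
```

The optional second argument exists only so the check can be tried against a saved copy of the mount table.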
Download and Install heartbeat
After downloading heartbeat, install it on the primary system:
ttisrv1# rpm -ivh <download_path>/heartbeat-0.4.9-1.i386.rpm
Configure heartbeat
The procedures for configuring heartbeat are well documented, and
you can find examples along with the documentation in /usr/share/doc/packages/heartbeat.
We show the specific configuration we used for our implementation;
if you require more information, please refer to the documentation
in that directory.
There are three files that you will need to set up to get heartbeat
working: authkeys, ha.cf, and haresources.
Configure /etc/ha.d/authkeys
This file sets your authentication keys for the cluster that must
be the same on both nodes. You can choose from three authentication
schemes: crc, md5, or sha1, depending on your
security needs. We chose md5. Here is our /etc/ha.d/authkeys
file:
# use md5 with key "ttikey"
auth 3
3 md5 ttikey
The authkeys file must be readable only by root, or heartbeat
will not start. Be sure to set the appropriate permissions after creating
this file:
# chmod 600 /etc/ha.d/authkeys
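A quick way to confirm the permissions is to look at the mode string that ls reports. The function below is a small sketch under that assumption; check_private is a hypothetical name, and it checks only the mode bits, not the owner of the file.

```shell
#!/bin/sh
# check_private FILE
# Succeeds only if FILE is a regular file whose mode is exactly 600
# (read/write for the owner, nothing for group or others).
check_private() {
    perms=$(ls -ld "$1" 2>/dev/null | cut -c1-10)
    [ "$perms" = "-rw-------" ]
}

# Example (not run here):
#   check_private /etc/ha.d/authkeys || echo "fix authkeys permissions"
```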
Configure /etc/ha.d/ha.cf
This file defines the nodes in the cluster and the interfaces that
heartbeat uses to verify whether or not a system is up. Here is our
/etc/ha.d/ha.cf file:
# define nodes in cluster
node ttisrv1
node ttisrv2
# time a system must be unreachable before considered dead (seconds)
deadtime 5
# set up for the serial heartbeat pulse
serial /dev/ttyS0
baud 19200
# interface to run the network heartbeat pulse
udp eth1
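The node names in ha.cf are expected to match what uname -n reports on each system, so a mismatch there is worth ruling out before starting heartbeat. The following sketch lists the declared nodes and checks for a given name; list_nodes and node_listed are hypothetical helper names, not heartbeat commands.

```shell
#!/bin/sh
# list_nodes FILE
# Print every node name declared with a "node" directive in FILE.
list_nodes() {
    awk '$1 == "node" { for (i = 2; i <= NF; i++) print $i }' "$1"
}

# node_listed NAME FILE
# Succeed if NAME is one of the nodes declared in FILE.
node_listed() {
    list_nodes "$2" | grep -qxF -- "$1"
}

# Example (not run here):
#   node_listed "$(uname -n)" /etc/ha.d/ha.cf || echo "this host is not in ha.cf"
```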
Configure /etc/ha.d/haresources
This file describes the resources that are managed by heartbeat. The
resources are basically just start/stop scripts much like the ones
used for starting and stopping resources in /etc/rc.d/init.d.
Note that heartbeat will look in /etc/rc.d/init.d and /etc/ha.d/resource.d
for scripts. Here is our /etc/ha.d/haresources file:
# use ttisrv1 as primary, use 192.168.0.100 as shared IP
ttisrv1 192.168.0.100 Filesystem::/dev/sda1::/ttidisk::ext2 smb nfslock nfs
This line tells heartbeat to start these resources on ttisrv1 with
the shared IP address of 192.168.0.100. It also tells heartbeat to
mount the filesystem found on /dev/sda1 at the /ttidisk
mount point and to start Samba and NFS.
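To make the structure of that line easier to see, the sketch below splits each non-comment line into its parts. It assumes the layout used in this article (node name first, shared IP second, then the remaining resources); describe_haresources is a hypothetical helper of our own, not something shipped with heartbeat.

```shell
#!/bin/sh
# describe_haresources FILE
# For each non-comment, non-blank line of FILE, print the node name,
# the shared IP address, and each remaining resource on its own line.
describe_haresources() {
    grep -v '^[[:space:]]*#' "$1" | while read -r node ip resources; do
        [ -n "$node" ] || continue
        echo "node=$node"
        echo "ip=$ip"
        for r in $resources; do
            echo "resource=$r"
        done
    done
}

# Example (not run here):
#   describe_haresources /etc/ha.d/haresources
```

Run against our file, this would list the Filesystem entry, smb, nfslock, and nfs as four separate resources, in the order in which they appear on the line.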
Configure /etc/hosts
Your /etc/hosts file should contain entries for both of
the nodes in your cluster and your shared IP address. Here is an
excerpt from our /etc/hosts file on ttisrv1 (each system
should have the same entries):
127.0.0.1 ttisrv1 localhost.localdomain localhost
192.168.0.99 ttisrv2
192.168.0.100 ttisrv
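A missing hosts entry is easy to overlook, so a small check can be useful on both nodes. This is only a sketch; missing_hosts is a hypothetical function name, and it looks for exact host names in the name columns of a hosts-format file.

```shell
#!/bin/sh
# missing_hosts FILE NAME...
# Print each NAME that does not appear as a host name or alias
# (any field after the address) on a non-comment line of FILE.
missing_hosts() {
    file=$1
    shift
    for name in "$@"; do
        awk -v n="$name" '
            $1 ~ /^#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$file" || echo "$name"
    done
}

# Example (not run here):
#   missing_hosts /etc/hosts ttisrv1 ttisrv2 ttisrv
```

No output means every name was found.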
Configure Samba
Samba version 2.0.6 is included with Red Hat Linux 6.2. If you
did not select the option to include it when you installed Red Hat
on your machine, you can install it from the installation CD using
RPM:
ttisrv1# rpm -Uvh /mnt/cdrom/RedHat/RPMS/samba-*.rpm
Substitute /mnt/cdrom with the mount point of the CD drive
on your machine if it is different.
Configure smb.conf
Check to see if Samba is already running. Type:
ttisrv1# /etc/rc.d/init.d/smb status
If Samba is not active you will see:
smbd is stopped
nmbd is stopped
If Samba is running, you will see:
smbd (pid 5103) is running...
nmbd (pid 5114 5112) is running...
Note: the PIDs might be different for your host. If Samba is running,
stop it:
ttisrv1# /etc/rc.d/init.d/smb stop
Strictly speaking, it doesn't matter whether Samba is running or not:
by default, Samba checks its configuration file every 60 seconds and
picks up any changes. For consistency's sake, however, we prefer that
Samba come up only after we are completely done with our configuration.
Remember to save a copy of the original file in case you have to go
back to it. Locate the file /etc/smb.conf
(or /etc/samba/smb.conf), and let's get to work.
Installing Samba is a simple process. You either select it as
an option during your Red Hat installation or install the package
later using RPM. Configuring it is another matter. We won't
go into a Samba configuration tutorial here. However, Using Samba
from O'Reilly & Associates delves into this subject, and
we recommend that you refer to this book for further understanding.
With that said, Listing 1 shows what we've done with our
Linux host to turn it into a Samba server. We basically started
out with the default smb.conf file that came with our system
and modified it to suit our needs. Please note that this is not
the complete smb.conf file; it only lists the options that
we actually changed. You can leave the other options as they were
in the original file.
Get Samba Ready
After you've finished making changes to the smb.conf,
you can check to make sure that your configuration file is set up
correctly and free of errors:
ttisrv1# /usr/bin/testparm -s
If there is no error, you are ready to go. However, don't start
the Samba server at this time. The heartbeat program is set up to
start it automatically.
Next, add a user to your Samba password file. This user should
already have a valid account on your Linux machine (i.e., the account
is present in your /etc/passwd file). If not, Samba will
refuse to add the user. For our example, we will add the user steve.
Mount the filesystem:
ttisrv1# mount /dev/sda1 /ttidisk
Add the user:
ttisrv1# /usr/bin/smbpasswd -a steve
Samba will prompt you for the new SMB password; enter it accordingly.
Unmount the drive (heartbeat will mount it for you):
ttisrv1# umount /ttidisk
Please note that the mounting and unmounting of /ttidisk is
totally unnecessary once the system is up and running with heartbeat
in control of the shared filesystem. This is just a demonstration
to provide you with a starting point for your Samba server.
Configure NFS
Make sure that you don't start NFS on boot up:
ttisrv1# /sbin/chkconfig --del nfs
Then add the links to kill NFS on shutdown or reboot:
ttisrv1# /sbin/chkconfig --level 016 nfs off
These steps are necessary because we want heartbeat to control the
startup of NFS.
Add the following to /etc/exports. Create the file if it
doesn't already exist:
# Export the shared disk, allowing read/write access and
# synchronous I/O with no write delay.
/ttidisk 192.168.0.*(rw,sync,no_wdelay)
Test heartbeat
Start heartbeat on the primary system with the following command:
ttisrv1# /etc/rc.d/init.d/heartbeat start
Starting High-Availability services: [ OK ]
If it fails, look in /var/log/messages to determine the reason and then correct it. After heartbeat starts successfully, you should see a new interface with the shared IP address that you configured in the haresources file. This interface is an alias, so it will be displayed like the following:
ttisrv1# ifconfig
<...clipped output...>
eth0:0 Link encap:Ethernet HWaddr 00:D0:B7:00:B5:09
inet addr:192.168.0.100 Bcast:192.168.0.255 \
Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:7 Base address:0x7000
Note: You might experience a short delay while heartbeat attempts to bring up the interface.
You should also see that the disk has been mounted:
ttisrv1# df -k
Filesystem 1k-blocks Used Available Use% Mounted on
/dev/hda1 2016016 42612 1870992 2% /
/dev/hda6 11385128 376292 10430500 3% /usr
/dev/sda1 8823404 552 8374644 0% /ttidisk
Also check to see that Samba and NFS started successfully:
ttisrv1# /etc/rc.d/init.d/smb status
smbd (pid 5729) is running...
nmbd (pid 5740 5738) is running...
ttisrv1# /etc/rc.d/init.d/nfs status
rpc.mountd (pid 5825) is running...
nfsd (pid 5841 5840 5839 5838 5837 5836 5835 5834) is running...
rpc.rquotad (pid 5816) is running...
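The check for the shared address can also be scripted. The sketch below reads ifconfig output on standard input and assumes the "inet addr:" layout shown in the sample output earlier in this section; other versions of ifconfig may print the address differently. The name has_addr is hypothetical.

```shell
#!/bin/sh
# has_addr IP
# Read ifconfig output on stdin and succeed if IP is assigned to some
# interface. Assumes lines of the form "inet addr:<IP> Bcast:...".
has_addr() {
    grep -qF "inet addr:$1 "
}

# Example (not run here):
#   ifconfig | has_addr 192.168.0.100 && echo "shared IP is up"
```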
Configure the Secondary System
The secondary system is configured with a subset of the steps used to configure the primary.
First, install the operating system as outlined in the section labeled Install the Operating System on the Primary System. Next, download and install heartbeat per the section labeled Download and Install heartbeat. We recommend that you copy all the configuration files created on the primary system to the secondary system. You could manually go through the process of creating them again but it is time consuming and prone to error. You can get all the files you need in a tar archive with the following command:
ttisrv1# cd /
ttisrv1# tar cvf <path>/hafiles.tar /etc/smb.conf \
/etc/ha.d/haresources /etc/ha.d/ha.cf /etc/ha.d/authkeys \
/etc/exports
After creating the tar file, transfer it to the secondary system using FTP or whatever method you prefer. (For example, you could mount the shared disk on the primary, put the tar file on it, unmount it, and then mount it on the secondary system for the extraction.) Then extract the configuration files:
ttisrv2# cd /
ttisrv2# tar xvf <path>/hafiles.tar
Remember to add entries for the primary system and shared IP address into your /etc/hosts file.
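Since the two nodes must agree on these files (authkeys in particular), one way to confirm that the copies arrived intact is to compare MD5 checksums. The following is a sketch using md5sum; the function names and the idea of shipping a checksum file alongside the archive are our own additions for illustration.

```shell
#!/bin/sh
# make_checksums OUTFILE FILE...
# Record the MD5 checksum of each FILE in OUTFILE.
make_checksums() {
    out=$1
    shift
    md5sum "$@" > "$out"
}

# verify_checksums SUMFILE
# Succeed only if every file listed in SUMFILE exists and still matches.
verify_checksums() {
    md5sum -c "$1" > /dev/null 2>&1
}

# Example (not run here), using the same file list as the tar command:
#   ttisrv1# make_checksums <path>/hafiles.md5 /etc/smb.conf \
#            /etc/ha.d/haresources /etc/ha.d/ha.cf /etc/ha.d/authkeys /etc/exports
#   ttisrv2# verify_checksums <path>/hafiles.md5 || echo "files differ"
```

Because the checksum file records absolute paths, it can be checked on the secondary system without changing directories.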
Test Connectivity
At this point both systems should be ready to go. There are a few tests you should try to make sure everything is in order before testing failover. First, make sure you can ping one system from the other on both interfaces. You also need to test that your serial connection is functional.
On ttisrv1:
ttisrv1# cat < /dev/ttyS0
On ttisrv2:
ttisrv2# echo "TTY test" > /dev/ttyS0
You should see the text on ttisrv1. You should also reverse the test to make sure you have bi-directional communication.
We will now test the ability to mount the disk on the secondary system. First, create the mount point:
ttisrv2# mkdir /ttidisk
Make sure you can mount the filesystem:
ttisrv2# mount /dev/sda1 /ttidisk
Unmount the disk:
ttisrv2# umount /ttidisk
Start heartbeat on Secondary System
If you've made it this far with no problems you are in excellent shape! All we have left to do is start heartbeat on the secondary system, and then we can test to make sure it all works.
Start heartbeat on secondary system:
ttisrv2# /etc/rc.d/init.d/heartbeat start
Testing Failover
You can test failover by simply stopping heartbeat on the primary system:
ttisrv1# /etc/rc.d/init.d/heartbeat stop
You should see all the services come up on the second machine in 30 seconds or less. If you do not, look in /var/log/messages to determine the problem and correct it. You can fail back over to the primary by starting heartbeat again. heartbeat will always give preference to the primary system and will start to run there if possible.
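To get a rough feel for how long a failover takes, you can poll the shared address from a client until it answers again. The helper below is a generic retry loop, offered as a sketch; wait_for is a hypothetical name, and the 30-attempt limit in the example simply mirrors the 30-second figure above.

```shell
#!/bin/sh
# wait_for ATTEMPTS COMMAND...
# Run COMMAND up to ATTEMPTS times, one second apart, discarding its
# output. Return 0 as soon as it succeeds, or 1 if it never does.
wait_for() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        if "$@" > /dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Example (not run here), from a client machine:
#   wait_for 30 ping -c 1 192.168.0.100 && echo "shared IP is answering"
```

Note that a successful ping shows only that the address has moved; checking that Samba and NFS are serving files still requires the status commands shown earlier.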
Caveats
In the configuration we have described, we used only one disk, which is a single point of failure. It would be preferable to either use a hardware RAID device or two disks with mirroring, which we felt was beyond the scope of this article. We have tested this configuration with a hardware RAID device with no anomalies noted. We have not tried to use software RAID 1 (mirroring), although we think it would work fine. We would also like to point out that any pending disk writes during a failover could fail depending on the precise timing, but a second attempt would work fine.
It is possible to corrupt your disk if both of the systems attempt to mount the filesystem read/write at the same time. This condition is known as "split brain", and you must take every precaution to ensure it does not happen. If both systems were to mount the same filesystem read/write, they would both attempt to keep the superblock synchronized without regard for changes being made on the other system. This will result in data corruption. The best way to reduce this risk is to have multiple heartbeat interfaces, so heartbeat can determine the status of the other system in the cluster. If you use only a single Ethernet interface and that interface fails, heartbeat will assume the system is down and attempt to take over the disk. By using an Ethernet and a serial heartbeat interface, it would take two distinct failures before split brain could occur.
Other Uses
There are endless possibilities for how you may use heartbeat to provide high-availability services. It is particularly good for providing Web services and read-only file access. For example, if you have a number of CDs that you would like to make available to your users, you could purchase a SCSI device with multiple CDs and share them all via Samba and NFS.
Summary
We have provided a way to set up a very useful and highly available file server using inexpensive hardware and software that are free, readily available, and relatively easy to set up. You are only limited by your imagination as to how you can expand this sample system to include other components to meet your needs. We hope that we have piqued your interest enough for you to get started on your own high availability project. Feel free to send us your comments.
Steve Blackmon cofounded Transparent Technologies, Inc. in 1999. He has been a Software Developer and System Administrator for 14 years. He currently provides consulting expertise in the areas of high-availability, SAN, and IT infrastructure to high-profile clients in the Atlanta area. He can be reached at: steve.blackmon@transtech.cc.
John Nguyen has a B.S. in Computer Engineering from Florida Institute of Technology, Melbourne, Florida. He is an application developer with 14 years of experience. His interests are computers, politics, and classical literature. He can be reached at john.nguyen@acm.org.