A Homegrown Backup Solution Utilizing RSA Keys, SSH, and tar

Ray Strubinger

Our hardware upgrade project included the task of carrying out backups for several new servers. However, budget constraints and space limitations made placing a tape drive in each rack-mounted server impossible. To solve the problem, our organization chose a Tandberg 35/70 DLT drive for use with the machine that would serve as the tape backup server and used shell scripts to perform the backups.

Each server ran FreeBSD 4.x, so a variety of utilities were available for a custom-tailored backup solution. Physically, the servers that needed to be backed up were in close proximity to each other and were networked with a switch. The tape server was a machine recently liberated by the hardware upgrade and was simply re-tasked to serve as a platform for backing up and restoring data. The tape server contained two SCSI controllers, with one controller used exclusively for the tape drive (the other controller handled the disks). Although IDE disks in removable trays could be used as a storage medium for backups made with the tape server, hard disks were not considered to be durable enough for our purposes. Tape technology is proven and, in general, we considered tape media to be more durable than hard disks. Adding removable disks to each server was not possible because the servers are rack-mount units that simply lack the space for additional hard disks.

The use of a dedicated machine for the tape drive provided a convenient way to do restores without the worry of harming a production server. Each server in our environment replicates its file systems, so a restore of data to a temporary area on a production system would result in the replication of that data to other machines unless replication was turned off prior to the restore. Reserving an area on the disk that is not replicated is somewhat limiting in our environment but may be a viable option in some situations.

backupserver.sh and backupclient.sh

Two scripts do the work, and some changes must be made before they can be used. The first script, backupserver.sh, resides on the tape server. It specifies which machines will be backed up and records how long the backup took to complete. See Listing 1. (All listings for this article are available at http://www.sysadminmag.com/code/.)
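As a rough illustration of the timing step, a script can note the clock before and after a backup and report the difference. This is a minimal sketch and not the code from Listing 1; the function name format_elapsed and the variable names are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helper: turn a number of seconds into "Hh Mm Ss".
format_elapsed() {
    total=$1
    hours=$((total / 3600))
    minutes=$(((total % 3600) / 60))
    seconds=$((total % 60))
    echo "${hours}h ${minutes}m ${seconds}s"
}

# Usage (commented out so that loading this file runs nothing):
#   start=$(date +%s)
#   ... run the backup for one machine ...
#   end=$(date +%s)
#   echo "Backup took $(format_elapsed $((end - start)))"
```

Keeping the arithmetic in one small function makes it easy to reuse the same report format for every machine in the backup list.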

This script also provides a way to detect and recover from an end-of-tape condition, since the script appends backups to the tape to allow data to be kept for a longer period of time. The script is typically invoked from cron, and it allows the tape server to transparently log in to each machine to be backed up, and execute the second script, backupclient.sh. The installation location of each script is not critical as long as the paths are set correctly within the scripts. The example scripts that follow back up three machines, two of which are tape clients named "liberty" and "eagle". The third machine is the tape server named "tape".

The transparent login is accomplished with RSA authentication over SSH, so the secure shell daemon on the backup server and on each backup client must be configured to support RSA authentication. This can be done by adding the lines PermitRootLogin without-password and RSAAuthentication yes to the sshd_config file and then restarting the secure shell daemon. If tighter security is required, it is possible to create RSA keys that can be used only for the purpose of backups. Consult the man page on sshd for more information on that technique.
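Before relying on the transparent login, it can help to confirm that both directives are really active in the daemon's configuration. The following is a minimal sketch; the function name sshd_allows_key_login and the configuration path in the usage comment are assumptions, so adjust them for the local system.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given sshd_config contains both
# directives named in the text as active (uncommented) lines.
sshd_allows_key_login() {
    config=$1
    grep -Eiq '^[[:space:]]*PermitRootLogin[[:space:]]+without-password' "$config" &&
    grep -Eiq '^[[:space:]]*RSAAuthentication[[:space:]]+yes' "$config"
}

# Usage (the path is an assumption):
#   sshd_allows_key_login /etc/ssh/sshd_config || echo "sshd_config needs editing"
```

The patterns are anchored at the start of the line, so a directive that has been commented out with # does not count as enabled.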

RSA keys are created with ssh-keygen. The ssh-keygen application will create a public key in $HOME/.ssh/identity.pub, where $HOME will probably be root's home directory unless another user is established for the purpose of running backups. If a user other than root runs the backups, that user must be able to write to the tape device. On the machine with the tape drive, run ssh-keygen and copy the contents of the resulting identity.pub file to $HOME/.ssh/authorized_keys on each machine that is to serve as a backup client. This will enable the tape server to log in to each tape client. ssh-keygen must also be run on each tape client (each machine to be backed up), and the resulting identity.pub must be copied to the authorized_keys file on the tape server. This will enable each tape client to connect back to the tape server and write to the tape drive.
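Copying keys by hand into authorized_keys invites duplicate entries and wrong permissions, so the step can be wrapped in a small function. This is a sketch under stated assumptions: install_public_key is a hypothetical name, the /tmp/tape.pub path is a placeholder, and an empty passphrase is assumed because a job started from cron cannot type one.

```shell
#!/bin/sh
# Hypothetical helper: append a public key file to an authorized_keys file
# unless an identical line is already present.
install_public_key() {
    pubkey_file=$1
    authorized_keys=$2
    key=$(cat "$pubkey_file") || return 1
    mkdir -p "$(dirname "$authorized_keys")"
    touch "$authorized_keys"
    # -F: fixed string, -x: match the whole line, -q: quiet
    if ! grep -Fxq -e "$key" "$authorized_keys"; then
        printf '%s\n' "$key" >> "$authorized_keys"
    fi
    # With StrictModes enabled, sshd rejects key files writable by others.
    chmod 600 "$authorized_keys"
}

# Usage (not run here; host name and paths are placeholders):
#   on the tape server:  ssh-keygen
#                        scp $HOME/.ssh/identity.pub liberty:/tmp/tape.pub
#   on liberty:          install_public_key /tmp/tape.pub $HOME/.ssh/authorized_keys
```

The same function serves the reverse direction, where each tape client's identity.pub is added to authorized_keys on the tape server.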

The backupserver.sh script also uses fping to determine which machines are alive. fping is a ping-like utility that uses Internet Control Message Protocol (ICMP) and is meant to be used in scripts. The primary difference between ping and fping is that fping will send out a packet then move on to the next host on its list instead of waiting for a timeout or a reply from the host it just pinged. The -a option of fping will only show systems that are alive. If the tape client does not respond to fping, no attempt will be made to back up that machine. Although the script does not send out a notification that a host was unreachable, it would be easy to provide such notification with the following lines inserted before the initial use of fping in the backupserver script:

# If the backup client is unreachable, notify someone
for j in `$fping_location -u $backup_client`; do
        echo "Unable to reach $j while attempting the backup" | mail -s "Unreachable Host: $j" $notify_me
done
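The complementary check, which decides which machines to back up, can be sketched in the same way with the -a option. This is an illustration and not the code from backupserver.sh; the function name alive_hosts, the fping path, and the loop body in the usage comment are assumptions.

```shell
#!/bin/sh
# Print the hosts that answer fping, one per line.
# $1 is the path to fping; the remaining arguments are host names.
alive_hosts() {
    fping_location=$1
    shift
    # -a lists only hosts that are alive. fping exits non-zero when any
    # host is unreachable, so its exit status is deliberately ignored.
    "$fping_location" -a "$@" 2>/dev/null || true
}

# Usage (the fping path and the echo body are placeholders):
#   for host in $(alive_hosts /usr/local/sbin/fping liberty eagle); do
#       echo "Backing up $host"
#   done
```

Because only responding hosts are printed, the loop never attempts a login to a machine that is down.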

Finally, a symlink named /dev/tape that points to /dev/nrsa0 should be created. Alternatively, it is possible to edit the references to /dev/tape in the scripts, or to set the TAPE environment variable on all the machines and adjust the scripts to handle that change. Consult the man page on tar for details.
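Both approaches can coexist if a script asks for the tape device in one place. The following is a minimal sketch; the function name tape_device is an assumption and does not appear in the article's listings.

```shell
#!/bin/sh
# Pick the tape device: honour $TAPE when it is set and non-empty,
# otherwise fall back to the /dev/tape symlink.
tape_device() {
    echo "${TAPE:-/dev/tape}"
}

# One-time setup on the tape server, run as root (not executed here):
#   ln -s /dev/nrsa0 /dev/tape
#
# Usage inside a script:
#   tar tvf "$(tape_device)"
```

With this arrangement a machine that sets TAPE overrides the default, while machines that rely on the symlink need no extra configuration.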

In most cases, the only changes required are specifying the machines and directories to back up and whom to notify about backup status. The paths to utilities such as tar, dd, ssh, or the backupclient script may have to be changed within the backup scripts if they do not match the locations of those utilities in the local environment. Although the script was intended to be called primarily from cron, it can be invoked manually, and the user will receive feedback on the various stages of the backup process.

Optionally, the code in Listing 2 could be added to the end of backupserver.sh to create a list of the files that were written to the tape. This index is useful because the contents of a tape can be checked without reloading and re-reading the tape once it has been removed from the drive. In the example, three machines have been backed up -- liberty, eagle, and tape (the tape server) -- and Listing 2 reflects the three machines.

The second script, backupclient.sh, resides on each machine that should be backed up and does not have a tape drive. It is called by backupserver.sh from the machine with the tape drive. As with backupserver.sh, backupclient.sh contains a list of parameters that can be customized for your environment. These parameters allow you to specify which directories are to be backed up, the paths to various utilities, and a person to notify of the backup's status. Once called by backupserver.sh, backupclient.sh will connect to the tape server and write its data to the tape drive. Each script is well commented, and most of the parameters that require modification are found at the beginning of each script.
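The general pattern of writing a client's data to a remote tape drive can be sketched with the utilities named earlier: tar produces the archive, ssh carries it, and dd writes it to the device. This is an illustration of the technique and not the contents of backupclient.sh; the variable names, the block size, and the directories in the usage comment are all assumptions.

```shell
#!/bin/sh
# Hypothetical settings; the article's listings define their own names.
tape_server="tape"
tape_device="/dev/tape"
block_size="10k"

# Write the given files or directories as one tar archive to standard output.
archive_dirs() {
    tar cf - "$@"
}

# Send standard input to the tape drive on the tape server.
write_to_tape() {
    ssh "$tape_server" "dd of=$tape_device bs=$block_size"
}

# Usage on a tape client (not run here):
#   archive_dirs /etc /usr/home | write_to_tape
```

Splitting the pipeline into two functions lets the archive step be checked locally, for example by redirecting it to a file, without touching the tape.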

These scripts have been used for the past year and a half and have evolved as our needs have changed. Both scripts are simple to follow and easy to maintain. The various utilities available to an administrator of a typical UNIX installation often make the creation of small, specific applications worthwhile in smaller UNIX shops.

Ray Strubinger has been a network administrator on various platforms (UNIX, NT, and Netware) for more than five years in the e-commerce and financial sectors. He is currently the UNIX administrator for an electronic bidding service and can be reached at: rays@infotechfl.com.