Cover V10, I04

Automatically Restart Login Services on a Remote Host

Hiu F. Ho

If you manage a server that is offsite, one of the worst things that can happen is that you can no longer log onto it, and you have to spend hours driving to the server in order to fix the problem. A number of problems, ranging from hardware failures to software problems, can keep you out of touch with your remote servers. For software-related problems, however, it may be possible to let the server automatically resuscitate your login service.

I live in Maryland, but I need to manage a FreeBSD Web server in New Jersey. (It's a personal Web server that is co-located at a Web-hosting company's data center.) For security reasons, the only way I can log onto the server is to use Secure Shell (ssh). The problem is, if sshd (Secure Shell daemon) has died for some reason, I will either have to drive three hours to the data center to restart sshd, or call someone at the data center to press the reset button to reboot my FreeBSD box (and who knows what damage that will do to the file system).

UNIX's login services are usually very stable and seldom crash. To prepare for the worst, however, I wrote a small Perl script. With the help of the cron utility, my server can now restart sshd if it is killed unintentionally. This works in most cases, but not when the underlying problem lies in the system or the hardware.

The idea is simple -- a cron job is set up to run the Perl script every few minutes, and the script checks whether sshd is currently running. If it's not, it starts the ssh daemon. This procedure will not solve all login problems, but it is the least you can do to keep the login service alive without purchasing any new hardware and services.

Setting Up

To begin, you must log in as root. Listing 1 shows the Perl script (chk-sshd.pl) I created to start sshd whenever necessary. There are two things to specify in the script. First, you need to specify the name of the login process. To find out the process name of your login daemon, start the login service (if it's not already started), then use ps to list all the processes that are currently running and look for the name:

$ ps ax
PID TT STAT  TIME COMMAND
...
162 ?? Is    0:00.99 moused -p /dev/psm0 -t auto
202 ?? Is    0:46.00 /usr/local/sbin/sshd
238 ?? S     0:02.46 /usr/interbase/bin/gds_lock_mgr
...
Because I'm running sshd as my only login service, I copy the command section of the sshd line (/usr/local/sbin/sshd) and put it into my Perl script's line #7.
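
If the ps listing is long, it can help to strip it down to the command column before looking for the daemon name. The following is only a sketch: ps_commands is a hypothetical helper name, and it assumes the five-column layout shown above (PID, TT, STAT, TIME, COMMAND), which may differ on other ps variants.

```shell
#!/bin/sh
# Hypothetical helper: print only the COMMAND column of `ps ax`-style
# output read from standard input. Assumes the command starts at the
# fifth whitespace-separated field, as in the listing above.
ps_commands() {
    awk 'NR > 1 {
        # Skip the header line, then rebuild the command from field 5 on.
        cmd = $5
        for (i = 6; i <= NF; i++) cmd = cmd " " $i
        print cmd
    }'
}

# Example use on a live system:
#   ps ax | ps_commands
```

The line that names your login daemon is the string to copy into the script.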

Next, specify the path to the login daemon that you want the script to start in line #13 of the script. In my case, I can use the which command to get the full path to sshd:

$ which sshd
/usr/local/sbin/sshd
Leave the rest of the script as is (unless you know what you're doing).

The first thing the script does is get a list of the currently running processes (line #19). It then examines the list to see whether sshd is currently running (lines #24 to #28). If the process is not in the list, the script starts sshd (lines #34 to #36).
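
For readers who want to see the logic in one place, here is a rough shell rendering of the same check-then-restart idea. It is not the Perl code from Listing 1. The function names is_running and check_and_restart are made up for this illustration, and the match assumes the daemon path appears as the fifth field of ps ax output.

```shell
#!/bin/sh
# Sketch only: a shell version of the check-then-restart idea.
# DAEMON is the path used in this article; adjust it for your system.
DAEMON="/usr/local/sbin/sshd"

# is_running NAME: read `ps ax`-style lines on stdin and succeed if any
# line has NAME as the first word of its command (field 5).
is_running() {
    awk -v name="$1" '$5 == name { found = 1 } END { exit !found }'
}

# check_and_restart: start the daemon only if it is missing from the
# current process list.
check_and_restart() {
    if ps ax | is_running "$DAEMON"; then
        return 0
    fi
    "$DAEMON"
}

# A script run from cron would end with a call such as:
#   check_and_restart
```

Comparing the whole command field, rather than searching for a substring, keeps an unrelated process such as "grep /usr/local/sbin/sshd" from counting as a running daemon.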

After you save the script, remember to chmod and chown the script file so it can only be read, written, and run by root.

$ chmod 700 chk-sshd.pl
$ chown root chk-sshd.pl
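
To confirm that the mode change took effect, the permission string reported by ls can be checked. This is a small sketch under the assumption that the script is a regular file; is_root_only_mode is a hypothetical helper name, and it checks only the mode, not the owner.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if FILE's permission string is
# -rwx------ (mode 700 on a regular file). Ownership is not checked.
is_root_only_mode() {
    perms=$(ls -ld "$1" | cut -c1-10)
    [ "$perms" = "-rwx------" ]
}

# Example:
#   is_root_only_mode /root/chk-sshd.pl && echo "permissions look right"
```
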
Next, add a cron job for root. If you already have a crontab file, add the following line to it. Otherwise, create a file with any name you like (e.g., mycrontab) and add the line there:

0,5,10,15,20,25,30,35,40,45,50,55 * * * * /root/chk-sshd.pl
You may need to change the path to your Perl script at the end of the above line. After adding the line, install the file with crontab. Note that this replaces root's currently installed crontab with the contents of the file:

$ crontab -u root mycrontab
Now you should have a diehard login service running on your server.
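
Because installing a crontab file replaces whatever root already had scheduled, one cautious approach is to start from the installed crontab and append the new entry only if it is missing. The sketch below assumes that approach; add_cron_entry is a hypothetical helper name, and the entry is the one used in this article.

```shell
#!/bin/sh
# Hypothetical helper: read an existing crontab on stdin and write it to
# stdout with the given entry appended, unless that exact line is
# already present.
ENTRY='0,5,10,15,20,25,30,35,40,45,50,55 * * * * /root/chk-sshd.pl'

add_cron_entry() {
    awk -v entry="$1" '
        { print }
        $0 == entry { seen = 1 }
        END { if (!seen) print entry }
    '
}

# Possible use as root (crontab -l fails if no crontab is installed yet,
# in which case the new file contains only the entry):
#   crontab -l 2>/dev/null | add_cron_entry "$ENTRY" > mycrontab
#   crontab mycrontab
```

Running the helper twice leaves a single copy of the entry, so it is safe to repeat.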

Testing

The simplest way to test whether it works is to kill your login service. In my case:

$ killall -9 sshd
Then wait several minutes and check whether your login service has been restarted. (Don't do this on a remote server unless you're certain everything is set up correctly.) In my case, five minutes after I killed sshd:

$ ps ax | grep sshd
54441 ?? Ss 0:00.70 /usr/local/sbin/sshd
It's alive, again!
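
One caveat when checking with ps and grep: the grep process itself can show up in the output, because its own command line contains the search string. A common workaround, sketched here with a hypothetical helper named find_sshd, is to wrap one character of the pattern in brackets.

```shell
#!/bin/sh
# Sketch: list sshd processes from `ps ax`-style input without matching
# the grep command itself. The pattern [s]shd matches the text "sshd",
# but grep's own command line contains "[s]shd", which the pattern does
# not match.
find_sshd() {
    grep '[s]shd'
}

# Example:
#   ps ax | find_sshd
```

If this prints nothing after the cron interval has passed, the daemon has not come back.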

Summary

After completing the above steps, you should have a server that is able to resuscitate its login service after the login daemon dies. Keep in mind that this won't solve all the login problems you might encounter, but it's the least you can do to recover the login service without any additional hardware or human support.

Hiu Ho is a Senior Software Engineer at Netword, and the creator of the Netword Agent for Linux. Hiu Ho can be contacted at: hiu@netword.com.