Perl and the Practical Systems Administrator

Randy Appleton

Systems administrators are very busy people. Sometimes a bit of common sense programming can go a long way toward eliminating some of the repetitive tasks associated with systems administration. Perl can be used to automate many recurring tasks. AccountCheck is an example Perl script that automatically checks for problems that can occur when user accounts are made and deleted without every necessary step being completed. Partially created or deleted accounts can be a security hazard, acting as hidden doors to let intruders enter a system. Most systems have such hidden doors. This Perl script checks for all the common types of errors that can occur. It checks the following:

• The home directory exists.

• The account does not share a home directory with anyone else.

• The account owner also owns the files in their home direcory.

• The shell is on an approved list.

• There has been file activity recently.

• The disk space used by the account is reasonable.

• The numeric user ID is not root’s.

• If there is a shadow file, the account is also listed there.

• All mail spool files have an associated user.

• All entries in /etc/shadow also appear in /etc/passwd.

Problem Description

I’m often asked by students to create or delete accounts just before I have to leave to teach. Although I know how to do these things well, like all humans, I sometimes make mistakes. If a sys admin successfully creates and deletes 19 accounts out of 20, he gets 5% wrong. Create and delete 100 accounts per year, and the number of errors becomes large. Even worse, some accounts become dormant. No one told the sys admin that the account is no longer needed, but the person who wanted it has left. In an academic environment, we tend to have many such “orphan accounts”. Every error results in a incorrect entry in /etc/passwd or /etc/shadow, or an unneeded home directory or mail spool file. These things waste disk space, and can act as a way for intruders to enter. Utilities such as adduser and userdel help. Nevertheless, these errors and orphan accounts occur.

Upgrading the system does not help. Typically, we upgrade by installing a distribution onto a new system, and then copying over the old /etc/passwd, /etc/shadow, and home directories. This just copies the old errors onto the new system. Searching for these problems manually is possible, but annoying and time consuming. A better solution is to have an automated tool search for these problems. The Perl language is designed for such tasks, and AccountCheck is a Perl script representing just such a tool.

Perl Program

Perl is the perfect language for checking accounts. The tasks needed to check accounts include parsing the password and shadow files, and checking for the existence of various directories and other files. These are the tasks for which Perl was designed.

Although one might assume the first step is to read /etc/passwd, in practice it works better to read /etc/shadow first. Once /etc/shadow has been read, all of the checks can be done with one pass through /etc/passwd.

The first test is to make sure the file /etc/shadow exists, because not all distributions use shadow passwords. Existence is checked using the -f operator in line 19. If it does exist, /etc/shadow is opened in line 25. The loop starting at line 26 means “While there are lines left in the file, read the next line and it to $line. When the file has been completely read, exit the loop to line 32”. Each entry in /etc/shadow consists of a user name, a password, and some other stuff, all separated by colons. This data is parsed at line 29, and the password for a user is stored in the associative array “password”. The password for “joe” is stored in password{“joe”}. Associative arrays (arrays where the index can be a string) are a wonderful tool, and will make this Perl script much easier to write.

  1:#################################################################
  2:# Change the parameters here
  3:#################################################################
  4:$LOWUID=100;             # don't look at accounts with uid below this
  5:                         # they're special system accounts
  6:$HOME="/home";           # where are the home dirs
  7:$PASS="/etc/passwd";     # where is the password file
  8:$SHADOW="/etc/shadow";   # where is the password shadow file
  9:$MAIL="/var/spool/mail"; # where are the mail spool files
 10:$OLD=30;                 # how many days of no activity before an
 account is old
 11:$BIGACCOUNT=10000;       # How many KB of disk space before an account
 is too big
 12:$MAXDEPTH=0000;          # Max depth of directories to seach for files.
 13:
 14:#################################################################
 15:# Nothing to change below here
 16:#################################################################

 17:#
 18:# Read /etc/shadow if it exists.
 19:if (! -f $SHADOW) {
 20:        print "No shadow password file.  Testing without it.\n";
 21:        $shadow = 0;
 22:}
 23:else {
 24:        $shadow = 1;
 25:        open(SH, $SHADOW) || die "Unable to open password shadow \
                                      file $SHADOW\n";
 26:        while ($line = <SH>) {
 27:               chomp($line);
 28:               local($junk); # make the -w not error the variable junk

 29:               ($name, $password, $junk) = split(":", $line);
 30:               $password{$name} = $password;
 31:       }
 32:}

The next step is to open and read the password file. We don’t have to check for the existence of /etc/passwd, because all distributions have this file. The password file is opened at line 35, and the “while” loop at line 36 reads each entry. Note how the “while” loop at line 36 looks just like the loop at line 26. Program fragments that look the same and have similar purposes are called program phrases, and serve the same purpose in program languages as clichés do in spoken languages. They provide a concise and easily repeated way of expressing a common idea.

One problem with parsing the password file is that sometimes the account includes contact information, and other times it does not. In other words, sometimes the /etc/passwd entry looks like ftp:x:14:50:FTP User:/home/ftp:/bin/bash, and other times it looks like ftp:x:14:50::/home/ftp:/bin/bash. When parsed by Perl’s split function, entries of the first type have the home entry in field #6 and the shell entry in field #7, but the other type of password line has the home and shell entries one field earlier. The “if” statement starting at line 43 corrects errors, setting the $home and $shell as needed.

Finally, the name of the user is recorded in the associative array %users. If $user{"scott"} is equal to one, the user “scott” exists in the password file. Otherwise, he does not. This will be used later to ensure that all mail spool files have an associated user.

 33:# Open and read in the password file
 34:local($name, $password, $uid, $gid, $longname, $home, $shell);
 35:open(PW, $PASS) || die "Unable to open password file $PASS\n";
 36:while ($line = <PW>) {
 37:       chomp($line);
 38:       $shell="";
 39:
 40:       # split the password entry, and correct if they
 41:       # don't have a longname
 42:       ($name, $password, $uid, $gid, $longname, $home, $shell) = \
            split(":", $line);
 43:        if ($shell eq "") {
 44:                $home=$longname;
 45:                $shell=$home;
 46:        }
 47:        $users{$name} = 1;

Now that the data has been read, the various checks must be performed. First, we check to ensure that the user ID corresponds to a valid user. Generally, user IDs below 100 are for general system purposes (ftp, the Web, etc.) and should not be checked for errors. If the account uid is too low, Perl’s next command restarts the “while” loop at line 20, reading a new line from /etc/passwd. User ID and group ID 0 are special. Any account with that user or group ID has full root privileges, even if it’s not named root. A hacker once broke into one of our machines, and created accounts with innocent-sounding names but with a user ID of zero. This hacker did substantial damage. Now we run this script, and all such accounts are flagged for inspection.

 48: # Check some obvious stuff
 49:        if ($uid < $LOWUID) {
 50:                next;
 51:        }
 52:        print "Checking $name\n";
 53:        if ($uid == 0 || $gid == 0) {
 54:                print "    $name has $uid and gid $gid!\n";
 55:         }

The password should be checked. If there is no password at all, then anyone can log in using that account. This is an open door to everyone, and is checked in line 56. If the shadow password file was located, then there should be an entry for this user in /etc/shadow. Line 43 prints an error if both shadow passwords were found ($shadow == 1, set in line 24) and that the particular user lacks an entry (!defined($password{$name}), set in line 40). The “!” operator means “not”.

Interestingly, even in systems with shadow passwords, not all accounts will have shadow passwords. Accounts made by hand, and accounts made before the system was upgraded to use shadow passwords, will store the password in /etc/passwd, not /etc/shadow. Even changing passwords will not cause the new password to be stored in /etc/shadow. Password entries must be moved from /etc/passwd to /etc/shadow manually.

 56:        if ($password eq "") {
 57:                print "    $name has no password!\n";
 58:        }
 59:        if ($shadow == 1 && !defined($password{$name})) {
 60:                print "    $name has no entry in shadow password \
                           file $SHADOW!\n";
 61:        }

Next, the shell should be checked. If the shell is not one of the approved shells (checked at line 62), then an error is printed. Often the shell is not on the approved list because the sys admin disabled the account. Here we disable accounts by changing the shell to /bin/DISABLED. When the user tries to log in, the login process will attempt to run the nonexistent /bin/DISABLED. When that fails, the user will be logged back out immediately. Setting the shell to /bin/DISABLED is sometimes more desirable than just deleting the account because it documents right in the password file that the account did exist, and has been disabled on purpose.

The home directory should also exist. If the directory does not exist (checked with the -d operator, line 66) then an error message is printed. If it does exist, then the program checks to see whether the user shares a home directory with someone else. Someone is already using this directory if the variable $homes{$home} has a value. For example, if $homes{“/home/randy”} has been set, then no one else should use the home directory “/home/randy”. At my school, both our secretaries use the same home directory, since they work on the same shared things. However, such sharing is often an error, and should be flagged for human inspection.

 62:        if ($shell ne "/bin/bash" && $shell ne "/bin/csh" &&
 63:                $shell ne "/bin/tcsh") {
 64:                print "    $name is disabled with shell $shell\n";
 65:        }
 66:        if (! -d $home) {
 67:                print "    $name has home dir $home which is not \
                           really a directory.\n";
 68:        }        else {
 69:                  if (defined($homes{$home})) {
 70:                          print "    $name shares a home dir with \
                                     $homes{$home}.\n";
 71:                  }

At this point, is has been established that the account has a valid home directory. It is also important to establish that the account is in use. One indication of use is that files are being changed. Although there are occasions when an account is used read-only, they are rare. The next segment of code attempts to find a file in the home directory tree of the user that has been modified recently. Of course, what’s recent to one person might be the distant past to another. We deleted an account after 120 days based on inactivity. Later that same day, the user attempted to make her first modification in the past four months and was suprised that her account had been deleted.

The code to examine the home directory looking for recently modified files starts at line 73. It can only be reached if the tests for home directory existence (line 66) and uniqueness (line 69) have been successfully passed. The next bit of code will check every file in the user’s account, and therefore is a convenient place to also check two other problems. The user should be the owner of every file in his account, and the total space used by the account should be reasonable. Although these things could be checked for in separate passes, it is more efficient to check all at once.

Although Perl can certainly read a directory using built-in Perl commands (like opendir, readdir, and stat), it is often easier to read a directory hierarchy using find. Line 83 runs the find process, which produces a line of output for every file to be checked. Each line looks like this: /home/randy/comdex-letter.html-XQC-4-XQC-randy-XQC-949629416. The filename is “/home/randy/comdex-letter.html”, the size is 4 K, the owner is “randy”, and the last modified time is 949,629,416 seconds since the Jan 1, 1970. The “-XQC-” are markers separating the fields, using the assumption that no filename has an “-XQC-” within it. Line 85 reads information from find, and lines 88 and 89 parse it into $filename, $size, $owner, and $date.

 72:           else {
 73:                       #
 74:                        # Scan the users home directory looking for
 75:                        # files owned by someone else, and checking
 76:                        # the amount of space used.
 77:                        #
 78:
 79:                        $homes{$home} = $name;
 80:                        # Checks the total space used by a subdir, \
                              and the access time of
 81:                        # the newest file in that dir.
 82:                        $latest = $filesize = $badowner = 0;
 83:                        open(FIND, "find $home -maxdepth $MAXDEPTH \
                              -printf \"%p-XQC-%k-XQC-%u-XQC-%C@\n\" |")
 84:                                || die "Unable to open find";
 85:                        while ($line = <FIND>) {
 86:                                # get a file and it's info
 87:                                #
 88:                                chomp($line);
 89:                                ($filename, $size, $owner, $date) = \
                                     split("-XQC-", $line);

Now that the output of find has been parsed, the checking can begin. The “if” statement at line 90 records the most recent modified time. After all files have been checked, the most recent time will be compared against the current time (line 104). An error will be printed if this most recently modified file was modified too long ago, since that would mean the account has not been used in a long time.

The owner of the file is compared to the owner of the account in lines 93-100. In an earlier version of this script, every mismatched file was printed, which could be a long list. This version prints an initial message on the first mismatch, and then prints “has more” on the second error, and ignores any further errors. The variable $badowner controls this. Initially, the variable is set to zero (line 82). If the first mismatched file is detected (line 93), an error is printed (line 94), and $badowner is set to one (line 95). If a second mismatched file is later found (line 97), an error is printed (line 98), and $badowner is set to two (line 99). From then on, no more errors of this type will be printed. Such use of a variable to control the number of errors printed is common, and such variables are sometimes called “state variables”.

 90:                                if ($latest < $date) {
 91:                                        $latest = $date;
 92:                                }
 93:                                if ($name ne $owner && $badowner == 0) {
 94:                                        print "    $name has a file \
                                            ($filename) owned by $owner.\n";
 95:                                        $badowner = 1;
 96:                                }
 97:                                if ($name ne $owner && $badowner == 1) {
 98:                                        print "    $name has more \
                                            files not owned by $name.\n";
 99:                                        $badowner = 2;
100:                                }
101:                                $filesize += $size;

The close bracket at line 102 terminates the “while” loop reading from find. At this point, information from all files within the account has been gathered. There are just two more checks needed. The modified time of the most recently modified file is compared to the current time at line 103. The difference is divided by 60 (number of minutes per hour) and 60 again (number of seconds per minute) and 24 (hours per day). Although you could simply divide by 86,400 (60 * 60 * 24), coding the divide in this way helps self-document the code, and leaves clues for later programmers to help understand what’s happening and why. The check at line 107 compares the total file size against the allowed limit, and complains if the account stores too much data.

102:                        }
103:                        $old = (time() - $latest) / 24 / 60 / 60;
104:                        if ($old > $OLD) {
105:                                print "    $name has not changed a \
                                                file in $old days.\n";
106:                        }
107:                        if ($filesize > $BIGACCOUNT) {
108:                                print "    $name uses $filesize KB \

                                           of disk space\n";
109:                        }
110:                }
111:        }
112:}

Conclusion

There are two categories of things to learn from AccountCheck. First, there are several Perl programming techniques, such as leaving self-documenting code (line 103), the use of state variables to help control output (lines 93-100), and the use of external programs to make Perl tasks easier (line 83). The second thing to learn is some systems administration advice. Perl can be both very useful, and very easy for helping catch systems administration tasks. Performing these checks by hand could take forever, and once the tool is written, it can be used many times to catch errors. Developing programming skills can help any systems administrator save time and run a tighter system.

URLs

The source code can be found at http://euclid.nmu.edu/~randy/Survey.html and at the Sys Admin Web site: http://www.sysadminmag.com. One of the best sites for Perl can be found at http://www.perl.org.

About the Author

Randy Appleton is a professor of Computer Science at Northern Michigan University and a big fan of Linux. He can be reached at: randy@euclid.nmu.edu.