Cover V03, I05
Article
Listing 1
Listing 2
Listing 3
Listing 4

sep94.tar


Checking User Security

Larry Reznick

While preparing for a UNIX security audit recently, I examined several user account issues that many of us may overlook. These issues include duplicate home directories, users idle for longer than many days, old accounts that haven't been used for a long time, and inactive or infrequent password aging. Left unattended, some of these issues may, over time, turn into big security problems. The key to preventing minor security breaches from becoming big problems is to learn about them soon after they happen.

Simple shell scripts running once a day, once a week, or once a month can catch these problems, and I've written a set of scripts that serve this purpose. One of my objectives in writing these scripts was to deliver the information without requiring root access. Comprehensive checking scripts that only root can run are not useful to department managers who could keep an eye on system issues but can't find out what they need to know because they can't run root jobs. System queries such as those used in these shell scripts don't require root access, so system administrators can offload these administrative responsibilities to group managers or department managers. Furthermore, because root needn't run these scripts, they can run under some other cron job that isn't as cluttered as root's.

Duplicate Home Directories

Generic login accounts are a bad idea. They seem convenient from the administrator's view: when many people need to use a program, and the only thing they do is run that program, why create dozens of individual accounts? Instead, why not create one generic account that everyone can login to? The problem with this approach is the lack of accountability. If the user's login name is your only way of knowing who is doing what, you won't know much with everybody using the same login name. tty and pty identifiers won't tell you who is really using the terminal. Is the person typically using that station really at some other station?

If the software asks for the user's identification, you might think that a generic login account will be sufficient because the program will keep track of who is doing what. Does that program notify UNIX of this user-specific information? Without such notification, can a UNIX administrator know what's happening? Does the program allow the user shell access? If so, that user is still not identified, yet can perform system operations without accountability.

Solve these problems with separate accounts. Give each user a separate login name and password. Give each login a separate home directory. Some system administrators are tempted to create a generic home directory and have every user working with that application login there. A generic home directory, however, makes file isolation problematic. If the application creates user-specific files, are those files identified uniquely for each user? If not, but they're put in the current directory or a central directory, how will you know which file belongs to which user? When a user has shell access or configuration facilities within the application, one user can interfere with another user's files or operation.

Make home directories private to each user. In a secure environment, home directories should not be writeable by any user other than the owner. Stringent security environments shouldn't allow home directories that are readable by any user other than the owner. When several people have ownership access to one home directory, these security precautions are useless.

A simple shell script, chkduphome (Listing 1), examines the /etc/passwd(4) file. It reports the names of any accounts sharing the same home directory. Field six of the colon-delimited /etc/passwd file holds the login account's home directory. cut(1) extracts the directories and sort puts them in order so that duplicates become visible. sort(1) has a "-u" option to do the work of uniq(1). uniq's "-c" option does something helpful that sort's "-u" option doesn't do: it shows the duplicate count. Every entry passed through uniq -c will have a count -- even the single entries. egrep(1) cuts out all entries with a count of 1. Spaces before and after the 1 ensure that only 1 is eliminated, not 10 or 21. Only duplicate entries remain, despite the number of duplicates found. Resulting entries are passed on to sed(1).

The script then applies two sed expressions to the entries representing duplicates. The first expression deletes the count part of the record, leaving only the home directory name. The second expression puts the colons back into the directory name. These actions prepare the name for another egrep search through the password file. egrep will collect the account names associated with those duplicate home directories. If the account names were collected in the initial cut, sort could have ordered them strictly by the home directory key, but uniq couldn't have counted the duplicates so easily.

Records are duplicates from uniq's view only if the entire records match. Some versions of uniq allow field specifications -- the version I used did not. sort won't emit only the duplicated lines. uniq will emit only the duplicated lines, but then I wouldn't get one line for each login account using a duplicate directory.

To get the names associated with the directory, the script outputs to a file that holds all of the duplicate directory names. With those duplicates known, it is a simple matter to tell egrep to search through the passwd file for any line containing the offending directory path. Every directory named in the duplicate file has a leading and trailing colon delimiter, as used by /etc/passwd. These delimiters create a whole-field match. They prevent matches when an upper level in a tree is a home directory, and they prevent matches to subdirectories in that same tree. For example, if some duplicate account error is in /usr, the program shouldn't output every account having /usr in its home directory path. Only those entries showing a home directory of :/usr: will match. egrep -f identifies the duplicate entries file so that egrep will match any of the duplicate entries for every /etc/passwd line.

Matching lines representing the duplicate entries from the passwd file are passed to cut, which extracts the user's account name and home directory. sort receives this information and orders the records by the home directory. sort puts all of the names together within a single home directory, then orders the lines by name within that single home directory. Finally, to make the output a little easier to read, tr changes the colons to tabs.

chkduphome's output remains unadorned so that it can be piped into another program. That other program could parse the output based on the tab between the fields. Of course, a cron job could simply mail the output to the administrator or group manager if no further processing were needed.

Catching Idle Users

User workstations idle for too long may represent a security problem. Such a user may have left the login active but may not be at the station. If a screen-saver program is active with a password-protected lock, there is probably nothing to worry about. However, when a user is idle for more than a week or two, I start to wonder if the user went on vacation without logging out.

The idleuser script (Listing 2) gives a list of overly idle users. For this program, hours or minutes of idle time aren't important, only days. The finger(1) command gives sufficient information in a table format. Unfortunately, finger doesn't output convenient tab separators between its table's fields. Fortunately, finger's fields appear to use uniform character lengths. So I give cut those character position ranges to extract the user names and their idle times, just being sure to include some separating spaces in the data's character ranges.

When finger sees an idle time of one or more days, it shows a "d" after the number. If I tell egrep to look for one or more digits followed by that "d," it will pare down the finger list to just the overly idle records.

awk does the rest of the work with the pared list. awk parses the login name field from the idle time field. Remember that awk sees everything as strings unless you explicitly use a value in a numeric expression. When a value contains anything other than a digit, you must tell awk to convert it to a number. The "d" at the end of each idle days number prevents awk from seeing the value as number. String comparison is inappropriate for comparing the days number with the idle days threshold coming from idleuser's command line. As a string, "1d" would come before "2d" but "9d" would come after "10d" because the "9" digit comes after the "1" digit. Adding zero to the value causes awk to take the numeric value of the string, throwing out all nondigit characters after the last digit, and use the result as a number. awk will compare the numbers correctly.

The program shows the names and idle days of only those users idle for more than the command line's threshold number. If you think that 7 days, 14 days, or even 30 days of idle time is fair, you could use "idleuser 30" as your command line or in your cron job. idleuser assumes that you're not interested in any user idle for fewer than those days. The script reports anyone idle for that number of days or more so that you can decide whether it's a problem.

In Search of Ancient Accounts

Using finger to find idle accounts immediately suggests using finger to identify ancient accounts. Sometimes accounts just drop off, and, in a busy job shop with lots of special access and consultant accounts, it is very easy to leave old, unused accounts on the system. Such an account represents a security issue because anyone who once had access to that account could still get in. idleuser used finger's summary output to find information about idle time in days. The oldacct script (Listing 3) also uses finger but requests more verbose information about each user.

When you give finger a user's name, even several users' names, it delivers lines for each user, identifying various facts about that user. finger takes most of those facts from the /etc/passwd file, along with the /etc/utmp(4) and /etc/wtmp files, where who(1) and login(1) keep their latest data.

Because finger gives verbose information only when given each user's name individually, oldacct must look through the passwd file to learn the users' names. Every possible account name is in the passwd file, including administrative user accounts and pseudo-user accounts, which may not get used for a very long time. Such accounts shouldn't figure into oldacct's checking. Typically, administrative accounts have low UID numbers. These numbers reside in field three of the /etc/passwd file. On the system for which I originally wrote the oldacct script, the threshold UID was 200. Administrative accounts had UIDs less than 200; regular user accounts used 200 or greater. You'll need to tune the oldacct script to reflect the lowest regular user UID your system uses.

The oldacct script must convert the month names, as finger delivers them, into month numbers. A function, monthnum, handles the conversion. monthnum sets up an associative array between the month names and their numbers. Given the name string, it returns the number. The script also includes code to show a way to turn a month number back into that month's name. That conversion isn't needed, though, so it is disabled.

Expiration dates derive from the current date, so the system delivers the current month, day, and year. A four-digit year matches finger's use. Using 90 days as the age threshold requires removing three months from the current date to get the expiration date. The script checks whether the current month is between January and March. If so, removing three months would wrap around to the previous year, delivering a negative number that would require an additional arithmetic operation. So instead of subtracting three months, the script adds nine months (January (month 1) becomes October (month 10), which is three months earlier), then reduces the year number. When the current month is April or later, it simply subtracts the three and leaves the year alone.

Although three months is considered 90 days without regard to the actual days elapsed, the day number will play a small part in the comparison. Because most months have 30 or 31 days, the day isn't too critical except for February, which may be off by as much as three days. So, if the calculated expiration month number is February and the expiration day number is past the 28th, I force it to be the 28th. Otherwise, I leave it alone. That's close enough for this rough work. If you need more precision, set up another array to deliver the correct total days for every month and include the February leap day.

Before the program begins its search for old accounts, it must decide which users to look for. awk passes through the password file using a colon as the field separator. If the record's UID comes after the threshold for regular users, it emits the user's name. All these names are collected into the USERS variable. With this variable set, the script can call finger once for each user name.

finger can match its arguments either in the login name or in the passwd file's GECOS field, which spells out lots more detail about the user. The GECOS field is informative when you are using finger interactively. You can ask for another user's first name or last name and get the rest of the information, including the login name. When you know very little about the person, this feature can help you discover more. For the script's purposes, a problem arises when a user's login name is his or her first name and several other users have the same first name. finger lists all of the users with that name, not only the one I want first. finger's "-m" option forces it to match only the login name.

If finger has trouble with that user's data, the script sends its error messages to the bit bucket. Otherwise, the fourth line contains the last login date and time. sed extracts that fourth line; the "-n" option suppresses sed's default printing.

The script's subsequent actions depend on the number of words in the fourth line. finger's output is not completely consistent. When the fourth line contains more than five words, that line has the information that the script wants. When there are fewer than five words, finger is indicating something special about the login. The shell's set command parses the words, making word counting and isolating simple.

One special fact finger can discover is whether the account has ever been used. If the user has never logged in to the account, the oldacct script must report that account name. The original version of the program simply output each user's name followed by whether that user was beyond the ancient account threshold. By overprinting, the script showed only those ancient accounts.

So much for best-laid plans. When I ran that version on the first test system, I found too many never-used user accounts, and these tended to scroll the list. I did want to know about such accounts. To correct the scrolling problem, I decided to separate the unused accounts from the used accounts and report the unused ones in a list at the end of the regular report. So, I commented out the simpler echo code and created a NOLOGIN variable to hold the concatenated list of never-used user names. At the very end of the script is the code to print out that variable's value.

Another problem came up when the GECOS field included lots of information. finger analyzes the GECOS field looking for certain characteristics. Most of the information appears on the second line, while the home directory and the default shell show up on the third line. When a home phone number was included, finger put it on the third line and dropped the home directory and default shell information to the fourth line, happily fouling up all of my line-oriented assumptions.

I couldn't avoid those line-oriented assumptions because finger doesn't apply a uniform heading to the line containing the login information. Sometimes it says "Last login," other times it says "On since," and, as already mentioned, it might say "Never logged in." These appeared most frequently on line four. Since I didn't want to run finger too often, I thought it would be easier to focus on that fourth line, and correct the program's assumptions when the fourth line was fouled. For the home phone number, the first word in the fourth line is "Directory:" so the script tests specifically for that. If it finds that word, finger is reexecuted to extract the fifth line. The set command parses the extracted line, and the while loop reexecutes using the newly extracted line.

Another special case appeared when there was no phone number in the GECOS field. When this happened, the third line contained the relevant data, not the fourth. (Account creation consistency can be a wonderful thing -- try to set your GECOS fields uniformly.)

If fewer than five arguments appeared, the user had never logged in. The while loop's test for fewer than five arguments has already accounted for the ancient account. A separate less-than-four test following the while loop continues with the next user name. All other account dating lines are five words or more.

If the first two words are "On since," the user is still logged in. Similarly, the fifth word is "Idle" when a user is logged in but idle. finger doesn't always show "On since" for the "Idle" case. These aren't ancient accounts so the script continues with the next user.

When all these tests pass, the current line contains the last login date. finger's fourth line contains the words "Last login" followed by the date when the user last logged in, including the day of the week, either the time or year, and which tty or pty device the user last logged in on. The fourth, fifth, and sixth fields contain the relevant date information. The OLDMONTH, OLDDAY, and OLDYEAR variables hold those date values. If the last login was within six months, finger shows the time in the sixth field; dates older than six months show the year instead. The easiest way to isolate that case is to look for a four-digit number. If the OLDYEAR variable has four digits, it can't be a time and so the account must be ancient. Otherwise, the script doesn't get off so easily.

If the month finger reports is less than the expiration month, this is an ancient account. What if the finger month is really old? If the expiration month is January, the finger month could be December and not be less than the expiration month. Because the expiration month is January, the current month must be April. December is still within six months, so this won't be caught by the four-digit year test. Consider that December (12) comes after April (4), the current month when this case happens. If either happens, this is an ancient account. Finally, if the finger month is identical to the expiration month, the script compares the account with the calendar day. An account is not ancient unless $OLDDAY is prior to that calendar day.

By the time the username loop finishes, all ancient accounts have printed with the user name and the last login date separated by a tab. As with the previous script, the format isn't necessarily pretty, but another program can easily parse it. If you want prettier output, surround the entire for loop with parentheses and pipe it into awk or something else.

The script finishes by testing whether the NOLOGIN string variable has anything in it. If it does, the variable contains a set of words, each word naming one user who never logged in. pr(1) formats that list into a set of columns if the list is delivered with each name on its own line. tr(1) translates the spaces into newlines. pr formats the lines into six columns, truncating the names as needed to fit. The names are not sorted, so they appear in their /etc/passwd order. If you'd rather sort the names, pipe tr's output through sort before piping the result to pr.

Stalking the Wily Aging Report

Unless you're the only user on your UNIX system, be sure to password protect all accounts and enable aging on all passwords. Password aging methods vary from one UNIX system to another. On SVR4, for example, the aging information is kept in the /etc/shadow(4) file as a set of colon-delimited numbers. However, HP-UX keeps the aging information in the /etc/passwd(4) file. Aging values are encrypted as base-64 numbers and concatenated to the encrypted password, separated from the password by a comma. The problem with this method is the lack of an easy way for anyone to review the settings. Without a simple way to review the aging settings, administrators over time have found it easier to leave aging unset than to figure out the proper settings and implement them.

The chkaging script (Listing 4) identifies the aging settings for the HP-UX /etc/passwd file. The script also offers translation features. Someone could deliver an encrypted aging value and find out what the aging numbers are, or deliver aging numbers and discover the correctly encrypted aging value. With this script, administrators and managers could review and properly set aging on their systems. (We eventually found a script from the HP-UX community that also set the /etc/passwd file. Many features in that script would have been included in the next incarnation of this chkaging script.)

Implementing a base-64 translation in shell script was an interesting exercise. Unfortunately, it got even more interesting when I discovered that HP-UX stores the digits in reverse place-value order. For instance, the decimal value 12 is one in the ten's place and two in the one's place. Written in reverse place-value order, that same number is 21, which violates the law of least astonishment. For those interested in the correct numeric way to handle the base-64 digits, I've left the original code, but commented out the lines.

Aging values come in three parts. The first part is the maximum number of weeks a user may continue using a password until the system forces a change. This maximum value uses only one character, so a password must be changed within 63 weeks from the date of the last change. You may require users to keep the same password for some minimum weeks. The next character stores that minimum time before a change. As with the maximum, the minimum may be as few as zero weeks. Such a user could change the password again immediately after changing it. As much as 63 weeks could elapse before the system allows a change. All remaining characters define the base-64 week number when the last password change was done. This week number is the weeks elapsed since 1970, where zero is 1970's first week.

NOW_WEEKS holds the number of weeks elapsed at the time the program runs. At first thought, this is the number of years since 1970 multiplied by 52 plus the week number of the current date. However, that's not really good enough. A 365-day year divided by a 7-day week yields 52 weeks and one day. For most quick and dirty estimations, 52 weeks is good enough, but every seven years, this Q&D calculation loses a week. With over 20 years having passed since the 1970 epoch, the Q&D calculation loses over three weeks. You can correct the formula by finding out how many seven-year periods have elapsed, then adding one week for each of those periods. So, NOW_WEEKS precalculates the number of years, then reuses that number to calculate the main weeks number and the fractional correction.

The weeknum function converts a base-64 number into its decimal equivalent. weeknum uses the DIGITS variable to hold the base-64 digits and associates their place values with each digit's subscript number. This function was one of those that needed changing to accommodate the reverse place-value order. Originally, a for() loop scanned across the base-64 digits from left to right. That for() loop remains commented in the code for reference, but I replaced it with a do...while() loop to scan from right to left. Shifting the 64-weighted decimal values is identical despite the digit scanning order.

The code_val function also required changing to handle the reverse place-value problem. Values are passed to this function as a comma-separated list. Because /etc/passwd requires the maximum and minimum values to be single character values, though a user could pass test values of any amount, the awk program range-checks them. Encoded values show as a set of characters, such as "oOAH" where the "o" is the max, the "O" is the min, and the "AH" is the weeks elapsed. The code produced can simply receive these characters concatenated. Because the max and min are single characters, awk's substr() function is sufficient.

The elapsed value requires base-64 decomposition. awk doesn't have Boolean AND or bit-shifting operators, so division and remainder operators are needed. These operations don't change with the reverse place-value order, but the concatenation sequence does. The original version showing the mathematically correct digit construction, appending the base-64 number to the latest digit's value, remains as a comment for interested readers. To get reverse place-order sequence, append the latest digit to the base-64 number constructed instead. The final base-64 number, whatever the place-value order, is concatenated to the max and min codes and emitted from the code_val function.

aging_val converts an encoded aging value into its decimal form. aging_val uses expr(1)'s colon operator to extract the max, min, and elapsed substring codes, and the weeknum function to show their decimal equivalents. One key issue is that HP-UX's /etc/passwd specifies that a missing elapsed code is equivalent to a zero value. To make the code regular, aging_val appends the ".." zero value to the code. Finally, in displaying the time elapsed since the last change, I chose to show how long ago the password was changed. This number is dynamic. As time passes from week to week, running the chkaging program will show different values for the same account.

show_aging is the default operating function for the chkaging program. It extracts the user names and their encrypted password fields from the /etc/passwd file. chkaging assumes that administrative accounts will have their logins explicitly disabled and will show an "*" instead of a password. It ignores such accounts. Otherwise, each name is echoed with its aging information. The shell's set command parsing mechanism separates the aging information from the encrypted password. There is no comma in the 64-digit encryption character list.

Originally, the script simply output the warning that aging was inactive next to each login name. Soon, though, I ran into a system where almost every account had no aging. To simplify identifying such accounts for repair, I decided to accumulate the names into a NOAGING variable. After showing the active user list, I display the NOAGING names all at once in a columnar list. Once the NOAGING variable's handling and printing were in place, I commented out the two original lines.

The main processing routine decides whether to call show_aging to review the /etc/passwd file, to analyze the encrypted aging values (the "-a" option), or to analyze the decimal week code values (the "-d" option).

Every UNIX system administrator must set up and keep track of many simple security measures. Simple scripts can automate these tasks. Eliminating root access requirements allows non-administrative users to monitor their portions of the system. Distributing the monitoring of simple security operations lets the administrator pay attention to the complex jobs.

About the Author

Larry Reznick has been programming professionally since 1978. He is currently working on systems programming in UNIX, MS-DOS, and OS/2. He teaches C, C++, and UNIX language courses at American River College and at the University of California, Davis extension.