Web-Based Printer Management

Brett Lymn

Some years ago, pundits predicted that there would soon be the paperless office -- an office where all communications and reporting would be performed electronically without consuming paper. At the time of this writing, I see little evidence of this happening, but I have seen the opposite -- more printers with more capabilities are being added to the office. The burgeoning printer population can be problematic for administrators when users want help removing print jobs from the printer queue.

On a UNIX system, users can kill their own print jobs if they have the correct tools and know-how. Unfortunately for the administrators, users sometimes can't kill the jobs without help (e.g., a user may be locked into an application and have no way of removing the print jobs). The printer problem becomes amplified when you have back-end services running on multiple machines talking to various printers, or a sys admin who may not be able to immediately kill a print job.

Background

One approach to the problem is to use something such as sudo (a program that can allow specific users to run specific programs as root) and give select users the ability to remove jobs from the print queues. This requires that the users with this access have the technical skills to find the right machine, find the offending job on the print queue, and kill it. Unfortunately, even when this task is scripted, most people view the process as somewhat cumbersome. Additionally, if you have many back-end database servers, you will need to grant the people managing the print queues access to the machines and also set up sudo to allow them to kill the print jobs. The task of maintaining these privileged users and the associated sudo setup becomes even more cumbersome as the number of back-end servers grows.

Most people are familiar and comfortable with a Web interface, and it makes sense that being able to manage printer queues from a Web interface would simplify printer management. The straightforward approach is to wrap some CGI scripts around the standard lp commands and leave it at that. However, this approach has some problems. One approach is for the Web server hosting the printer management scripts to know about all the printers on all the back-end database servers so that the lp tools will be able to manage them. This, in itself, is a burden because the admin must remember to add the printer not only to the back-end database server, but also to the WWW server. Another problem with this approach is that if you want your Web interface to be able to kill print jobs, you must have a setuid root script or run the Web server as the user to which the back-end databases submit their print jobs. Both of these approaches have major security problems (especially running setuid root scripts from a Web server).

Solaris lpd

With these points in mind, I looked at how I could simplify printer management. I knew from prior experience that the line printer daemon (lpd) could be used to query a remote host about the status of a printer. As stated before, the normal tools require you to have a printer queue set up on the machine but reading RFC 1179 indicated that you could send a specially crafted packet to the remote host's lpd port to inquire about the status of a queue on a printer. This meant that I could get around the requirement of having the printers defined locally by simply talking directly to the printer port on the remote host using the lpd protocol. This lead to another discovery about Solaris.

On Solaris, the standard port for the lpd to listen on is defined as port 515. On a UNIX system, because this port is below 1024, it is considered a privileged port and can only be opened by root. This really only affects the lpd itself; you can connect to the port as a normal user. The quirk here is that some implementations of lpd actually check the port the connection is coming from and reject it if the source port is not the printer port (i.e., port 515). I think this was originally a method of preventing a normal user from connecting to the lpd and manipulating the queue because only root could open the printer port and therefore could be trusted. However, this scheme relied on everyone obeying the rule that unprivileged users could not open tcp ports below 1024. (This is not the case these days, which makes the source port checking a bit pointless.)

The Solaris lpd implementation does not check the source port of the lpd connection and, hence, will allow a normal user to manipulate a print queue via the lpd protocol. In my case, this was an advantage because it meant that I did not need to have any setuid root scripts to manipulate the print queues on my Web server. This could be a nightmare for others because a user with sufficient skills would be able to modify the print queue remotely. Note that you cannot just telnet to the printer port on the remote host and manipulate the queues due to the way the lpd reads the data from the network connection.

Writing the Script

Given that I had now satisfied my criteria, (not requiring the printers local on the Web server and being able to manipulate printer queues without root), it was time to start coding. I already had a Web server running Apache with the mod_perl extension, so instead of doing a normal CGI script, I was able to write one that worked with mod_perl. (See http://www.apache.org/ for information on both the Apache Web server and the mod_perl extension.) I wanted a script that took the machine and printer names as arguments and would then query the remote machine for the queue of the given printer. If there were jobs in the queue, then these jobs would be presented in a scrolling list. The user could select the jobs from the list and press a button to remove the jobs.

Some classes are imported to provide the functionality we need. The IO::Socket class is used for the network connection to the lpd on the remote host. The other classes import the necessary parts to deal with HTML and extracting parameters from the arguments passed by the Web server. For the moment, skip over the get_status and cancel_job functions and look at the mainline code. The first important bit is:

my $r = Apache->request;

which creates a new Apache request object holding all the information the server needs to handle requests. When the script has the request, it sends out a standard text HTTP header using:

$r->content_type('text/html');
$r->send_http_header;

The browser will now have been sent the header and be ready for the rest of the page. The script then grabs the arguments that were given to it, which is done by importing the names like this:

CGI::import_names('PARAMS');

This imports the variables into the PARAMS name-space, which is a security measure to ensure that command-line arguments cannot be used to overwrite internal script variables. Using the PARAMS name-space, we grab copies of the remote machine name ($print_host) and the remote printer name ($printer). Once this is done, the script outputs the start of the page and starts an HTML form. The script then stashes a copy of the remote host name and the printer name in hidden fields in the page by doing this:

$r->print(hidden(-name => 'hidden_print_host', -default => "$print_host"));
$r->print(hidden(-name => 'hidden_printer', -default => "$printer"));

Apache normally forks multiple copies of itself, because you can't be sure that you are talking to the same Apache process on a UNIX machine when the user hits a form button. This means that keeping state between the generation of the form and the post of the form data cannot be done in the script itself, hence the hidden fields on the form page. Hidden fields are not a good idea in a security critical situation but do the job nicely in this case.

The script then checks whether it has received any values using one of the query submission buttons; more on those later. The script has two buttons to look for -- a delete button and a refresh button. If either of these buttons has been pressed, the script retrieves the value of the remote host name and the printer name from the hidden fields where the script initially placed them. The script just retrieved those parameters from the arguments a few lines ago, but remember that if a button has been pressed, the script is processing a form submission (not generating the original form). Thus, the script cannot trust the original arguments, because they may be for an entirely different machine and printer. When we know the printer and host that the script are dealing with, the script prints a nice heading at the top of the page using:

$r->print(h1("Printer Management For Printer $printer On $print_host"));

The script then checks whether the refresh button was pressed or whether the delete button was not pressed; the latter implies this is the first time the form has been generated, which happens when the user clicks on the link to load the form page. In this case, the script calls the get_status function, which simply opens a TCP/IP connection to the remote host and sends a message to the remote lpd to return a short queue status for the desired printer. The get_status function reads the reply from the remote lpd and puts the results into an array and returns it to the caller. In the mainline code, the script checks the first array entry. If it is empty, there were no jobs in the queue. Rather than printing a blank scrolling list, the script simply prints a message telling the user there were no jobs in the selected print queue. If there were jobs in the queue, then the script generates a scrolling list using:

$r->print(br, scrolling_list(-name => 'job_list', \
          -values => [@jobs], -size => 10, -multiple => 'true'));

The br is simply a line break and is used to position the scrolling list; the actual list is done by the scrolling_list method. We give the scrolling list a name so the script can retrieve the selections later. "values" is the array of jobs that the script fetched using get_status. The list-box has a size of ten lines, and multiple list selections are allowed. If there are jobs, then the script needs a delete button to delete them, which is done with:

$r->print(p,p, submit(-name=>'Delete', -value => 'Delete'));

This inserts a couple of paragraph breaks to space the delete button away from the scrolling list and creates a button labeled "Delete" and a value of "Delete". This is the value passed in the form submission if the button is hit. This is how the variable $button gets the value "Delete" earlier in the script. After the delete button is done, the script skips to the end and puts in a refresh button in a similar manner. It then closes off the form and the HTML page. The form is now rendered by the client browser, and the script waits for the user to press a button. If the user presses the refresh button, the script will go through the process just described, which results in the presented queue entries being updated. If the user hits the delete button then another code path is followed. In the case of a delete being processed, the script first retrieves the selections made by the user by getting the job_list parameter (the name given to the scrolling list) and puts them into an array. The script then processes the list of jobs and parses out the job identifier from the queue entries that were selected. If no jobs were selected for deletion, then the script simply complains and does nothing else. If some jobs were selected for deletion, then the array of identifiers is passed to the cancel_job function, which opens a TCP/IP connection to the remote host and sends the cancel command for the given job identifiers on the given printer queue.

One thing to note here is in the packet sent to the lpd by the script lies about the sender's identity. The script claims to be root, otherwise the lpd will refuse to kill any print jobs that are not owned by the same identity passed in the command packet. Root is allowed to kill any job. The lpd accepts that the identity of the user running the script is root because the lpd protocol assumes that it is talking to a well-behaved client who never lies.

The cancel_jobs function checks the reply from the cancel request and if there is any reply, then the cancel did not work and the function returns a false status. This is used back in the mainline code to give the user confirmation that the cancel happened or that it failed for some reason. Once the script has processed the deletions, the script will output the code for a refresh button so the user can refresh the view of the queue, and possibly delete more jobs.

Now that the script is done, we need to put it onto the Web server so other people can access it. Configuring Apache to run a mod_perl script is beyond the scope of this article. (Check out the Apache Web site for detailed instructions: http://perl.apache.org.) I placed the script into the appropriate directory and told Apache that it was a mod_perl script and should be executed with the CGI emulation. Once the script was installed, I set up another script that automatically generated a Web page with links for all the printers on the remote machines to the printer queue management script like this:

<A HREF=http://webserver/perl/printer_queue.pl?print_host=server1 \
  &printer=headoffice>Manage the Head Office printer queue</A>

I then pointed our help-desk people and a few others to the Web page and went on to the next task knowing that I would no longer get phone calls requesting emergency print job deletions.

Note that it is necessary to create access to a lpd remote printer on the remote machines that are going to have their printers managed for this scheme to work. The lpd service on the remote machine is not set up unless you have done this. I normally just use the admintool to create access to an lpd printer; this only needs to be done once on each remote host.

Conclusion

The script I have shown here can create a security risk because it is used in a restricted environment by trustworthy users. Access to the script should be restricted by using an access password to prevent accidental print job deletion. The use of hidden fields in the form can also create a security problem. In this case, there is no real danger but a user could view the hidden fields simply by viewing the HTML source and could possibly change the values in the hidden fields before the form is submitted.

Brett Lymn works for a global company helping to manage a bunch of UNIX machines. He spends entirely too much time fiddling with computers but cannot help himself.