Article

Using Your Web Browser as a GUI

Arthur Donkers

In the "past," a word that has a very relative meaning in computer terms, most programs had a user interface based on some sort of text terminal and displayed their information in an 80 x 24 character screen. However, a few years ago a new trend started emerging. Graphical user interfaces were a hot item and, indeed, they provided the user with a clear presentation of information. Now we see a trend toward sticking a "graphical face" on any application, whether good or bad. But what about all those existing character-based applications? Or the ones users have written themselves and are so accustomed to? No problem, in this article I will show you how to use your World Wide Web browser as a graphical user interface to a program running on your UNIX machine.

Introduction

The techniques presented here are not new. Indeed, they have already been used by, among others, SATAN, the notorious network security tool. Another example is Merlin, a graphical front-end to a number of network security packages, such as COPS. So, you can also use the sources of these programs as a reference.

Why should you even consider using a web browser as your user interface? First, WWW browsers are available for a great many platforms, ranging from Microsoft Windows to UNIX to OpenVMS. You can "use" your user interface on all of these platforms, regardless of their operating system. This proves a great benefit, especially in a multivendor environment in which you have to offer a user interface to different users on different machines. All of your browsers, however, as well as your own program, must adhere to one of the available HTML standards. (A number of versions are available see the HTML sidebar for more information.) In my examples, I will stick to Netscape, because it is one of the most widely used web browsers.

A second reason for using a web browser as your interface is that web browsers are written for obtaining web pages over a network connection. The browser has all the necessary network software built in, so you don't need to program that yourself. This network-based setup enables you to separate the user interface from the actual application. In this way, you can isolate changes in the user interface from the actual application code and save time and effort maintaining your application.

The last, but definitely not least, reason is the ease of programming. It is much easier to write a user interface in HTML than a comparable one in Motif or Win95. You can also test features very quickly, because HTML is interpreted by your browser, and the HTML source is the code as well.

Writing your user interface in HTML also has some drawbacks. First, HTML is not (yet) as powerful as Motif. You cannot create as sophisticated an interface in HTMLas you can in Motif -- with all its widgets for buttons, scrollbars, and such. You have fewer basic elements to choose from. Another drawback is that all communication is done across the network. This creates some special security complications, because all data could be sniffed when traveling over the wire. A possible disadvantage of this network connection is that you will need to write special network programs to be able to access datafiles that are part of your application. However, if these drawbacks are not a problem for you, you have a very powerful tool with which to build your graphical user interface.

Our Setup

All of the programs in this article were written in perl (version 5.001m) or C. The advantage of using perl is that it combines the power of a shell script (interpreting code) with a structural programming language like C. It has built-in support for network sockets and a lot of other extras that you can use from one consistent programming environment. Furthermore, perl is an interpreted language (with an invisible precompile phase), which speeds the development process. I used Netscape (version 1.1N) running either on a SPARC with Solaris 2.4 or on a plain PC with Windows 3.1.

The program presented here is not a so-called CGI script (a program executed by the web server). It is a very dedicated web server in its own right. It will dynamically generate pages and send these to the browser. The generated pages will contain links to other pages, either regular web pages or new dynamically generated pages. For the user, there will be no noticeable difference between the dynamically generated pages and the regular web pages.

The Basics

The basis for communication between the browser and a server is the Uniform Resource Locator (URL). A URL is sent by the browser to a server and is a request for information. The information, if present, will be sent by the server, and the browser will interpret this information. If applicable, the browser will display the information, and the user might give a new request. These URLs are addresses of the form: http://www.reseau.nl. A more elaborate example might be the following one:

http://www.reseau.nl:1305/index.html.

The URLconsists of three parts. The first part is http://. This tells the browser how the data resulting from the request should be interpreted. An http request will result in information in the HTML format. Instead of http, you may also specify ftp. This tells the browser the information sent back is in fact a file that should be stored without interpretation. So, this first part will define the format of the data.

The second part, in this case www.reseau.nl:1305, tells the browser how to contact the machine on which the data is stored. The nodename is www.reseau.nl, and the network portnumber is 1305. In "normal" requests for regular web pages, this last number might be missing. If so, it is defaulted to the http portnumber, which in most cases is 80. In the case of the GUI through WWW, this portnumber will be used to start your own server daemon. The browser can then contact that program via that portnumber.

The last part, index.html, determines the name of the file requested. The server will open this file, stick a header on it, and send it to the browser. The browser will then, in this case, interpret it as a HTML document and display its contents to the user.

The Browser Request

The URLshows how a user enters his or her request, but how is this request transmitted? There is a simple way of finding out. You can write a simple program that will listen on a free network port and display all the data that is received. The program itself will come back later in another form, but Listing 1 shows you the request sent by the browser. All data is sent in ASCII format. The generating URL is:

http://ra.reseau.nl:1307/index.html.

The first line of the listing determines the type of request, in this case a GET. This tells the server that the browser wants data to be sent back. Another type of request is POST. In that case, the browser will send user-entered data to the server, and thus create some sort of user interaction. The command word GET is followed by the name of the requested file: /index.html. This document is located in the root of the documentation tree served by the server. During the configuration of the server, you can specify where this root lies in the filesystem. The last part specifies the protocol used in communications, in this case, HTTP version 1.0.

The next lines tell the server something about the capabilities of the browser. The first tells the server which browser is used, in this case Netscape running on Solaris 2.4 on a SPARC station. The GUI program could use this information to adapt its output to the kind of browser used.

The four accept lines tell the server which format of information the browser can digest. In this case, it can accept GIF, JPEG, and X bitmapped images. Any other kind of format for text is welcome (the first accept line with */*).

The Program

The GUI program will need to read a request from a network port (socket), filter the appropriate information, and do something useful with this request. The perl program that does all this is shown in Listing 2. It is the complete program, and I will explain the different parts in the next paragraphs. This is an experimental program to show the techniques involved; you will need to polish it if you want to use it in a production environment.

The program, which begins its execution at the comment # Start here, can be divided into several parts. First, it will create a network port (socket) on which it will listen for incoming browser requests. The portnumber is determined by the service name wwwgui. This name should be present in the /etc/services file on your UNIX machine. This file will relate the name to a portnumber you have choosen. Make sure that all clients using this server program use the same portnumber. The subroutine create_socket takes care of this.

Once the socket is created, it can start waiting for a connection and an incoming request. Note that a web browser creates a new connection for each request it issues. You do not need to save connection information between different requests, which simplifies your server design.

Handling a Request

When the server receives a request, the request is analyzed, and all relevant data is extracted. The first line is read to determine the type of request, and thus the processing that will be involved. A GET request should be handled differently from a POST request. This processing is done in the subroutine do_req, which is the heart of our server program.

GETting Information

In the case of a GET request, the name of the requested file is also extracted. In this program, that filename is used as the name of a script that should be run. If no file, or a nonexistent file, is requested, a default script should be provided to run a default program or to display some (error) information from that default script.

The output of these scripts is sent directly to the browser, so it should be in HTML format. The best thing to do is to write a (perl) wrapper around your existing application. This wrapper will translate the output to HTML and send it to the browser. The server program will call this wrapper program whenever a user opens a link to it. Instead of perl you might also use expect for these wrapper programs. This expect program is available through the Internet and is especially written for making (programmable) wrapper scripts around interactive programs.

POSTing Information

I have showed how you can GET data from a server, but to be able to interact with your programs, you must also have a means of entering data. This input should be sent from your browser to the server. The server will then hand over the input to the actual application. Luckily, HTML has a special tag, called a FORM, which allows you to enter data and send it to the server. For regular web pages, this tag is used to send data to the webmaster, or subscribe to a magazine, and so forth. In this case, the data is sent to the server that will call a CGI script to do the actual data processing. However, the server in this article will process the POST request directly. Ineed to know what that request looks like, so I wrote a special script (Listing 3) that displays a FORM in your browser. Figure 1 shows the browser and the entered data. By logging all data sent to the server, I deduced the format of the POST request. This is shown in Listing 4.

As can be seen from Listing 4, the first line describes the kind of request, in this case, a POST request. The argument in this line is the script that should handle the processing of the data, normally a CGI script, but for this server it might as well be a perl program. A number of extra informational lines are present. The next to last line gives the number of characters in the last line, and the last line is a concatenation of all input fields in the FORM. Characters are translated so no spaces will occur in this line. All spaces in the input are replaced by the "+" character, and other characters are escaped by a "%xx" representation to guarantee one string without spaces.

All input fields of the FORM are separated by the "&" character. Note that the first three input fields are defined as "hidden" on the form. This means the user does not see these fields and cannot enter data into them. These hidden fields are normally used by your script to enable better processing.

The do_req function in Listing 2 handles the reading and processing of these input fields in a POST request. First, the dataline is split up by using the "&" as a separator. The different fields are stored in an array. Then each element of the array is processed so that all the special characters are translated back into their original meaning. You end up with an array in which each element corresponds with an input field in the original FORM.

Once you have received and processed the input fields, how do you transfer these to your program? This depends on how you call your program, but one of the simplest ways is to use the input fields as command-line arguments when calling your program. It is then up to your program to handle (or ignore) these arguments and use them as input for whatever you need.

Authorization

So far I haven't bothered about authorization. Anybody can call up any script; there is no restriction. In a real-life production environment, however, you might want to restrict the access to certain scripts. Luckily, the HTTP protocol has a means to do this.

The server determines whether a browser is authorized to access information. It maintains a register of which scripts are "publically" accessible and which need some form of authorization. Whenever a browser issues a GET request for a restricted script, the server will respond with an error message. This message is interpreted by the browser, and if applicable, the browser will display a window in which the user must enter a username/password combination. An example of such a window is shown in Figure 2. The message a server sends when a browser is not authorized has the following layout:

HTTP/1.0 401 Unauthorized to access the document

This format strongly resembles the way the ftp daemon handles communication errors. It will send back a string with an error number and a descriptive error string. When the browser has read the username and password from the user, it will resend its request with the extra authorization information. The browser will encode the username/password combination in a sort of uuencoded format; the server will decode it and check it against its internal registry. If correct, the server will execute the script and, thus, make the information available. Note that the browser saves the entered username/password combination and adds it to all subsequent requests, whether the server needs it or not (see Listing 5).

I wrote the decode program for the authorization information in C. It is based on the sources found in the CERN and NCSA server daemons. The source for this program is shown in Listing 6. The server program maintains two tables that contain the authorization information called, authdb and passwd. The authdb file contains the names of scripts and users who may call them. The passwd file contains the unencrypted passwords for all users in authdb. An example of the authdb is:

form.pl:arthur

An example of a passwd file is:

arthur:ZegIkNiet

As shown here, the server does not provide for a very secure setup. You can edit the source of the program, however, and adapt it to a more secure environment. In this respect, it is worth noting that the username/password combination is not strongly encrypted. Anyone with a network sniffer tool can catch the password and decode it. The same goes, however, for a telnet session in which the password is transmitted across your network completely unencrypted. The browser, or at least Netscape, has a stronger encryption method, based on RSA, but it is beyond the scope of this article to elaborate on that.

The program as presented in Listing 2 has all the hooks for doing the authorization checks. It has not been fully implemented, but you can build onto it to complete it.

A Short Example

Finally, I would like to show you a short (and thus limited) example of how we applied these techniques in one of our own programs. I use this program to initiate a dial-up connection to one of our Internet providers. When we start the initial program, it checks to see which dial-up scripts are available and will present these in a list. This screen is shown in Figure 3. We can add and delete dial-up scripts without any need to change the user interface. Furthermore, we can initiate a dial-up connection from any machine on which we have a browser session. (Note that the actual connection is made by the machine on which the server software runs.) The code for this program is shown in Listing 7.

When you click on one of the links, another script is executed (I stripped a lot of code from it for brevity). This script will make the actual connection. Its progress is shown in another web page, and each time the script prints something to stdout, it will appear on this page. In this way you can keep track of the progress of the executing script. The code for this script is shown in Listing 8, and the page that tracks the progress is shown in Figure 4. With a few simple scripts you can make a rather sophisticated application.

Conclusion

The programs presented here enable you to put a graphical face on your application. With these programs you can provide a simple GUI that will work across many platforms without your having to recompile the code. You can dynamically generate pages that have the same look and feel as "normal" web pages, and you can also mix "normal" web pages, for instance, the ones that contain your online documentation, with programs that should be executed when you select a link.

I reiterate that this program is experimental, and is not suited for use in a production environment. You will need to tune it and add more error handling for user and system errors, but it shows the techniques involved and demonstrates how you can apply these in your own programs.

About the Author

Arthur Donkers graduated from the Delft University of Technology with a degree in Electrical Engineering, with a major in Computer Architecture. He has since worked for a number of major software houses in the Netherlands. His primary field of interest is data communications. especially the integration of multi-vendor networksystems. For the last four years he worked as an independant consultant for his own company, Le Reseau ("The Network").