Making a Wish: The Web-Interface Shell

Shawn Bayern

Without even realizing it, your computer-illiterate neighbor may have recently been a UNIX user, having purchased goods or services using a Web site hosted on a UNIX server. Ten years ago, users sat at command-line terminals and engaged in text-based dialogs with a machine. Increasingly, however, the honor of using command-line shells is being reserved for systems administrators, and it has become their job to protect users from errors like “Command not found” and from having to set a PATH.

Of course, e-commerce isn't the only thing that's Web-based. Developers have written multiple HTTP/CGI interfaces for email, file access, and many common systems administration functions. Now, it's often taken for granted that new utilities and intranet applications will provide Web-based interfaces for their users. And why not? Web-enabling old tasks lets users take advantage of a single familiar interface and even, in many cases, a single client program. If everyone has Web browsers and knows how to use them, why not make it the interface to everything?

Although everything's being moved to the Web, the shift is occurring in individual steps, program by program, and task by task. Little work has been done to bring to the Web the entire range of services a machine offers, all at once. Sys admins often “webify” individual commands (“Webifying UNIX Commands”, Sys Admin, October 1999) and give their users a Web-based IMAP client. But what happens if a user wants to edit an arbitrary file over the Web? How easy is it to run a command over the Web that the administrator hasn't anticipated? And what about commands where the user needs to engage in an active dialog with a program? Can they ever be offered on the Web without being completely rewritten?

A Web-Based Shell

As research I conducted at Yale University's Computer Science department, I explored the viability of a “Web-interface shell”, which I promptly named “Wish”. In principle, such a Web-based shell would allow users to perform arbitrary functions on a remote machine by using a Web browser, the same way a command-line shell gives users access via a textual interface.

The term “Web-based shell” is slightly poetic; such a “shell” isn't exactly equivalent to a traditional UNIX shell like csh or bash. It isn't, for instance, associated with a user in /etc/passwd, and it's not run when a user logs in over a text-based interface. It's simply a browser-accessible interface to a UNIX host -- a shell in the sense that it wraps the functionality of a UNIX host in a convenient interface for users, allows broad access to the programs and files on a host, and interprets commands. In short, it adapts the flexibility of a traditional UNIX shell to the Web.

What should a general-purpose Web-based shell offer? For starters, it should authenticate users and enforce access constraints. It should allow the user to navigate the directory structure of the server, view files, and run commands, all from within a Web browser. Programs like ls that simply take arguments and print output are easily brought to the Web, but I considered it important to explore the viability of running interactive programs (e.g., fsck or rm -i) from a remote browser.

Designing and Implementing a Shell as a Web Application

A prototype and demonstration of this concept was written in Perl (5.004) and tested on Linux (most extensively on kernel 2.0.36) and Solaris 2.6. However, it should run on any system that supports UNIX-style processes and IPC. The code runs successfully but was designed as a demonstration only, so use caution and test it extensively before running it in a production environment. A strong background in Perl is suggested for anyone choosing to use my prototype of Wish in practice.

Wish makes use of the common subset of functionality accessible to all popular Web browsers; no JavaScript, DHTML, or even graphics are used or required. It is straightforward to add them and thus extend the user interface, but the goal was to be as portable as possible. Even text-based browsers such as lynx should work.

Early on, I decided that designing Wish as a standalone server would provide for maximum configurability and portability to multiple platforms. Think of the current implementation as analogous to telnetd or sshd -- it accepts incoming connections from users rather than running from inside a Web server. I didn't want something as generic as a shell to depend on an httpd server being installed, or for it to be subject to problems with the security of that httpd server. At the very least, the choice to make Wish stand alone reduces administrative overhead. However, any Web technology could have been used (e.g., simple CGI, JSP, PHP, etc.). The high-level implementation wouldn't even be that different, in terms of overall logic, between these technologies.

The Interface: What a Webified Machine Looks Like

Wish acts as a Web server and handles requests for URLs by returning dynamically generated HTML to its Web-browser clients. The HTML sent to clients contains links (<A> tags and <FORM>s) to URLs that point back to Wish. Thus, all communication between Wish and the user takes place via HTTP requests and responses.

Once the user logs in, Wish presents a view of the user's home directory as if the user had typed ls -l (see Figure 1). Every file that's displayed is a link back to Wish. If the user clicks on a file, Wish displays it for the user. By default, Wish displays files simply by showing them to the browser surrounded by <PRE> tags, but just like Windows, it can be configured to handle different sorts of files based on their extensions. For instance, man pages (those ending in .1, .2, etc.) can be piped automatically to nroff, and a C program can be displayed by a C-to-HTML converter. The sys admin can adjust how Wish displays different sorts of files by editing a configuration file.
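The extension-based dispatch described above could be sketched as follows. This is an illustration in Python rather than the prototype's Perl; the table layout, the `c2html` converter name, and the function names are assumptions, since the syntax of Wish's configuration file is not shown here.

```python
import html
import re

# Hypothetical handler table: (filename pattern, command to pipe the file through).
# In Wish these rules would come from the administrator's configuration file.
HANDLERS = [
    (re.compile(r"\.[1-9]$"), ["nroff", "-man"]),  # man pages: .1, .2, ...
    (re.compile(r"\.[ch]$"), ["c2html"]),          # placeholder C-to-HTML converter
]

def choose_handler(filename):
    """Return the command for the first matching rule, or None for the default."""
    for pattern, command in HANDLERS:
        if pattern.search(filename):
            return command
    return None

def render_default(text):
    """Default display: escape markup characters and wrap the text in <PRE> tags."""
    return "<PRE>" + html.escape(text) + "</PRE>"
```

Escaping the file's contents before wrapping them in <PRE> is an assumption of this sketch; without it, a file containing HTML markup would be interpreted by the browser rather than shown as text.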

Files that are executable have an “x” bit in their listing that functions as another link back to Wish. When Wish receives a request for such a link, it runs the program and displays the output to the user. Users can also run arbitrary commands by typing them in an <INPUT TYPE="TEXT"> box (labeled “Run”) at the bottom of the screen (see Figure 2). Wish, like most modern shells, keeps a history of commands the user has executed, allowing simple retrieval and re-execution of prior commands. This history is displayed to the user via a <SELECT> HTML control.

Directory navigation is possible graphically, by clicking on a directory name to navigate to, or by typing a traditional cd command into the Run box.
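A cd typed into the Run box cannot simply be handed to a child process, because a child's change of directory would vanish when the child exits; as in a traditional shell, it has to be treated as a built-in that updates the session's notion of the current directory. A minimal sketch in Python, assuming hypothetical names (`handle_cd`, `cwd`, `home`) that are not taken from the prototype:

```python
import os
import shlex

def handle_cd(command, cwd, home):
    """If command is a cd built-in, return the new current directory.

    Returns None for any other command, so the caller can run it normally.
    Raises NotADirectoryError when the target does not exist.
    """
    words = shlex.split(command)
    if not words or words[0] != "cd":
        return None
    target = words[1] if len(words) > 1 else home  # bare "cd" goes home
    new_dir = os.path.normpath(os.path.join(cwd, target))
    if not os.path.isdir(new_dir):
        raise NotADirectoryError(new_dir)
    return new_dir
```

The returned directory would be stored with the rest of the session state and used as the working directory for the next command.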

What Wish introduces is the functionality embodied by the “Run Interactively” box (Figure 2). This box, like the “Run” box above it, runs a program on the server and displays the output to the user. However, Wish allows the user to enter input to the program when required, and at each stage the user can see the output and enter more.

Like many Web applications, Wish tracks a user “session,” giving the user the sense that separate pages are part of the same logical unit, much the way that a user at an online bookstore can add a book to an online shopping cart on one page and see it in the cart later, on another page. Session tracking is important for a shell because it must maintain information about the user's current environment (e.g., what the current directory is, or what commands the user has recently run).

Session tracking is implemented without cookies to provide maximum flexibility for users. A browser, or user, does not have to accept cookies in order for Wish to function.

Securing a Powerful but Potentially Dangerous Interface

Allowing users to access arbitrary services on a machine through the Web looks, at first glance, like a recipe for disaster. What could be worse for a machine than letting users run programs via the Web from arbitrary locations? Clearly, any Web-based shell must be able to authenticate users and give them access only to the services for which they are authorized.

In the demonstration code, authentication is currently handled through a simple password validation using getpwnam(). The password is requested via a simple Web form and checked at only one point in the code, so it's easy to modify Wish to use PAM or to validate credentials using any other method.

Once credentials are validated, security depends on the session-tracking mechanism mentioned above. Once the user authenticates successfully, a “session” is created. The fact that a user's credentials have been validated is stored along with the rest of the session state. If this weren't the case, either the user would need to re-enter credentials at every page or the password would need to be sent as part of the URL or POST data. Either alternative is clearly undesirable.

Session state is recovered through a randomly generated “session key” that's changed for each request (to serialize access). This technique for session-state maintenance is often described as “URL rewriting” and is best explained by example: If user “Joe” types in his password successfully, a session is created with the random identifier “A”. (Of course, a long random number is used, not a single letter.) In the HTML returned to the user as part of the response to the request that sent the credentials in the first place, all URLs and <FORM>s back to Wish contain this unique identifier.

When Joe next chooses an action and clicks on a link or a button, the session key “A” is sent back to the server. The server looks up “A”, associates it with Joe, and is thus able to process the request in the appropriate security context. Before sending back HTML to the user, the session key is changed to “B” (a new random identifier) and recorded for later use. This process continues until Joe's session expires, which occurs after a configurable idle period. (A logout button to explicitly invalidate a session could be easily added but doesn't exist in the prototype.)
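The exchange of “A” for “B” can be sketched as a small in-memory store. This is an illustration in Python under stated assumptions, not the prototype's Perl code: the class and function names, the `session` parameter name, and the 15-minute default timeout are all hypothetical.

```python
import secrets
import time
from urllib.parse import urlencode

class SessionStore:
    """Maps one-time session keys to per-user session state."""

    def __init__(self, idle_timeout=900.0, clock=time.monotonic):
        self._sessions = {}  # key -> (state, time the key was issued)
        self._idle_timeout = idle_timeout
        self._clock = clock

    def _issue(self, state):
        key = secrets.token_urlsafe(32)  # long random identifier
        self._sessions[key] = (state, self._clock())
        return key

    def create(self, username):
        """Call once, after the user's credentials have been validated."""
        return self._issue({"user": username, "history": []})

    def rotate(self, key):
        """Consume key; return (state, new_key), or None if unknown or expired."""
        entry = self._sessions.pop(key, None)  # each key is valid only once
        if entry is None:
            return None
        state, issued = entry
        if self._clock() - issued > self._idle_timeout:
            return None
        return state, self._issue(state)

def rewrite_url(path, key):
    """Append the current session key to a link that points back to the server."""
    separator = "&" if "?" in path else "?"
    return path + separator + urlencode({"session": key})
```

Every link and form in a generated page would pass through something like `rewrite_url` with the freshly issued key, so the next request carries it back.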

When the server determines the owner of the session, it (as a process forked by the master server to handle the incoming request) drops its root privileges and assumes the identity of the session owner. From then on, the operating system enforces all access constraints. The procedure is not dissimilar to /bin/login running as root, accepting a credential, and then spawning a text-based shell with user privileges.
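The privilege drop performed by the forked handler might look like the following in Python. This is a sketch of the usual UNIX sequence, not the prototype's code; the function name is an assumption, and the calls succeed only when the process starts out as root.

```python
import os
import pwd

def drop_privileges(username):
    """Irreversibly switch this (root) process to username's identity."""
    entry = pwd.getpwnam(username)  # raises KeyError for an unknown user
    # Order matters: changing groups requires root, so it must precede setuid().
    os.initgroups(username, entry.pw_gid)  # supplementary groups
    os.setgid(entry.pw_gid)
    os.setuid(entry.pw_uid)  # after this call, root privileges are gone
    return entry.pw_dir  # home directory, for the caller's convenience
```

Because the change is made in the child process that handles a single request, the master server keeps its root privileges and can serve the next user.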

To give a sys admin greater control over what users can and cannot run via the Web, Wish also provides “restricted shell” functionality. The administrator, by editing a configuration file, can enumerate a list of programs to which users have access. If a user tries to run a program not in this list, he or she will receive an error message.
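A restricted-shell check of this kind could be sketched as below. The one-name-per-line file format, the `#` comment convention, and the function names are assumptions made for illustration; the article does not show the format of Wish's configuration file.

```python
import shlex

def load_allowed(config_text):
    """Parse a hypothetical allow list: one program name per line, # for comments."""
    allowed = set()
    for line in config_text.splitlines():
        name = line.split("#", 1)[0].strip()
        if name:
            allowed.add(name)
    return allowed

def is_allowed(command, allowed):
    """True only if the command's first word is exactly a listed program name."""
    try:
        words = shlex.split(command)
    except ValueError:  # unbalanced quotes and similar
        return False
    return bool(words) and words[0] in allowed
```

The check is only meaningful if the approved command is then executed directly from its word list rather than passed to /bin/sh; otherwise a string such as "ls ; rm file" would pass the check and still run a second program.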

While Wish isolates users from each other and preserves the UNIX security model, it does not currently take network security into account. Before being used in production, Wish should incorporate one of the commonly available SSL implementations so that Web browsers can access it securely. Without that additional security, a packet sniffer or compromised router could be used to read a user's plaintext password off the wire or “intercept” and hijack a Wish session.

In some ways, setting up Wish could actually make an installation more secure if it were used as an alternative to simple telnet, which requires plaintext authentication. Secure protocols for text-based login sessions, like SSH and Kerberized telnet, are popular, particularly at universities, but clients for these protocols aren't omnipresent, often causing users to revert to insecure telnet when necessary. Most users, though, have access to an SSL-capable Web browser, making an SSL-enabled version of Wish a potentially attractive service to offer.

Programs That Can't (or Shouldn't) be Webified

Wish is clearly still nascent technology, and a few easily remedied limitations have already been suggested. There are some limitations, however, that the architecture will never be able to overcome, at least not easily. Providing a browser interface to arbitrary curses-based programs through a Wish-like shell, for instance, will likely never be worth the effort it requires.

As for handling line-based interactive programs (which, as I mentioned, was a goal for Wish), a heuristic solution is possible, but a perfect platform-independent solution remains elusive. When dealing with an interactive program, a shell must be able to determine whether the program requires input or is still able to continue producing output on its own. Most sophisticated users of traditional shells have seen this behavior before:

csh% cat &
[1] 525
csh%
[1]  + Stopped (tty input)  cat

When a backgrounded process attempts to read from the terminal, csh and other shells can handle the condition and notify the user. Text-based shells can only do this because they have help from the terminal driver, which sends a SIGTTIN to a backgrounded process (and its process group) when it tries to read from the terminal.

Wish, of course, doesn't have this advantage; no terminal driver is involved, making it impossible to rely on SIGTTIN. Determining whether an arbitrary process is waiting for input is not straightforward. Thus, Wish uses a simple heuristic to decide when an interactive program needs input. It waits for the process to stop sending output and, after a configurable delay, gives up and assumes that input is required. This approach works for normal line-based programs where users enter a command, receive a response, and enter another command (as in the traditional “hunt the wumpus” game pictured in Figure 3).
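The silence-based heuristic can be illustrated with a short Python sketch. It is not the prototype's Perl implementation; the function names and the one-second default delay are assumptions, and the child is connected through plain pipes, as the text describes.

```python
import os
import select
import subprocess

def start_interactive(argv):
    """Start a child with pipes (not a terminal) on stdin and stdout."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        bufsize=0,  # unbuffered on our side, so input reaches the child at once
    )

def read_until_quiet(proc, quiet_seconds=1.0):
    """Collect output until the child is silent for quiet_seconds or closes stdout.

    Returns (output, finished). finished is False when the silence timeout
    fired, which the heuristic interprets as "the program wants input".
    """
    fd = proc.stdout.fileno()
    chunks = []
    while True:
        ready, _, _ = select.select([fd], [], [], quiet_seconds)
        if not ready:
            return b"".join(chunks), False  # quiet: assume input is required
        data = os.read(fd, 4096)
        if not data:
            return b"".join(chunks), True   # end of file: the program is done
        chunks.append(data)

def send_line(proc, text):
    """Forward one line of user input to the child."""
    proc.stdin.write(text.encode() + b"\n")
```

Each page shown to the user would correspond to one `read_until_quiet` call, and each form submission to one `send_line`. The heuristic errs in both directions: a program that computes silently for longer than the delay looks as if it wants input, and many programs buffer their output when it goes to a pipe, so a prompt may not arrive before the delay expires.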

This strategy isn't perfect, but it deals with simple line-oriented interactive commands. Commands that do sophisticated things with their terminals won't run properly within this framework. Without writing a custom pseudo-terminal driver specifically for Wish (an endeavor that would be both platform-specific and daunting) it's very likely futile to hope for perfect support for arbitrary interactive programs within this architecture. (We can, however, continue to wish.)

Other Ideas for Bringing Shell Access to the Web

Writing a Web application that constructs its own interface and processes user commands isn't the only way of bringing the flexibility of a shell interface to the Web. Mindbright Technology has written a free, pure Java SSH client called MindTerm (http://www.mindbright.se/mindterm/), for instance, that can run as an applet in a browser. For sites that don't use SSH, The Java Telnet Application/Applet (http://www.mud.de/se/jta/) may be useful.

Wish's approach does, however, add a few new possibilities to Web-based shell access. For instance, Wish's interface is customizable and extensible by anyone who can program Perl and knows HTML. HTML (and DHTML/JavaScript) interfaces are popular for a reason -- most people find it substantially easier to use them than older command-line interfaces. Simply migrating a text interface to the Web, character by character, might miss an important chance to improve, rather than just repackage, the interface to UNIX hosts.

Furthermore, since most of Wish's processing occurs on the server side, older browsers can use it; the capability to view HTML and process <FORM>s is enough. For sites where a large base of users has older browsers, rolling out a version of SSH or telnet implemented in Java might not be useful.

As a single point of entry into a system, Wish may be able to localize important security constraints, messages of the day, and other administrative functions. This localization can, of course, simplify administration.

Keep in mind that the Java-based solutions are considerably more mature than Wish, which at this stage is still experimental. If you need a quick solution to let any user with a modern browser access your UNIX host, setting up a Web page using one of the applets mentioned above is probably your path of least resistance.

Using Wish, though, or writing your own version based on the idea of delivering shell-like access over the Web, can bring new convenience to sys admins and their users. Will UNIX shells ever become as natural to users as e-commerce shopping carts? Probably not. But making flexible functionality more accessible is a step in the right direction.

About the Author

Shawn Bayern is a systems programmer in Yale University's Technology and Planning group. “Wish” was developed while he was an undergraduate at Yale under the advice and guidance of Professor Stanley Eisenstat. Shawn can be reached at: shawn.bayern@yale.edu.