Building a Web-based Java Application Server with Apache JServ
Don Gourley and Wei Wu
From the early days of the World Wide Web, systems developers have looked for ways to generate dynamic Web pages, especially for content based on information in databases. Application servers are now available that integrate directly with Web servers to extend Web server processing, without the overhead or complexity associated with previous mechanisms for dynamic content (such as CGI or scripts embedded in Web pages) (Figure 1).
The latest generation of these Web application servers combines the benefits of a robust, scalable application server with the power and flexibility of the Java programming language. Custom Java objects, third-party class libraries, database connectivity (JDBC) drivers, and Java's bytecode-based Virtual Machine (JVM) allow developers to quickly build and deploy Web applications integrating information from a variety of databases and Web servers.
The Java application server market is very crowded now, with large companies like IBM (WebSphere) competing with smaller companies trying to carve out a niche for their products. However, one of the most popular Java application servers is Apache JServ, a freely available open-source product from the Java Apache Project. In this article, we describe how to install and configure JServ to provide systems developers in your organization with a full-featured, reliable, and scalable platform for developing Web-based, server-side applications.
JServ Features
Apache JServ provides most of the features you expect in a Java application server, including full support for the JavaSoft Servlet API (version 2.0). It also provides a number of tools for performance tuning (e.g., load balancing among multiple servers) and systems administration. The Java platform provides complete portability of the JServ engine and application servlets across a variety of UNIX and Windows platforms.
JServ is a standalone process that communicates with Apache Web servers through the Apache mod_jserv module. When a request for a Java servlet is received by the Web server, mod_jserv passes it to JServ where the request is processed and results sent back to the Web server. mod_jserv and JServ use the Apache JServ Protocol (AJP) to communicate. AJP is a network protocol providing support for very complex network environments (e.g., multiple Web servers connecting to many JServs on different servers).
Currently JServ only supports this integration with Apache Web servers (versions 1.2.x and 1.3.x). Since Apache Web servers can run on virtually all UNIX and Windows NT computers, this does not limit JServ deployment very much (although it does make it a more obvious application server choice for organizations already using Apache). In our organization, we have Apache Web servers running on Sun, IBM, and Intel-based servers. We have successfully deployed JServ on UNIX (Solaris 2.6) and Linux (RedHat 6.0) operating systems, and the following installation and configuration instructions are based on those experiences.
Installing JServ
JServ is completely written in Java (although mod_jserv is written in C using the Apache API), so it requires the JVM to run itself as well as application servlets. JServ requires a version 1.1-compliant JVM. The JVM comes in 1.1.x versions of the Java Development Kit (JDK), which also includes the core Java classes and some documentation. Sun provides free 1.1 JDKs for Solaris and Windows at:
http://java.sun.com/products/jdk/1.1/
Although Solaris comes with a JDK pre-installed, we recommend updating to get the latest bug fixes and performance improvements. 1.1 JDKs for other operating systems, including Linux, are available from IBM:
http://www.ibm.com/java/jdk/
and Blackdown.org:
http://www.blackdown.org/
The official list of compatible Java ports is at:
http://java.sun.com/cgi-bin/java-ports.cgi
JDK installation instructions vary depending on the package. IBM's Linux port can be installed by simply unpacking a tar file. A common location for JDK is /usr/java, so we renamed the resulting jdk118 directory to /usr/java. On Solaris, you must use Sun's package management commands (e.g., pkgrm, pkgadd) to install software packages for the Java runtime environment, development environment, JIT compiler, man pages, and demo programs. They require that you remove the existing packages before upgrading with newer ones. Specific instructions for running the install commands are in the README file included in the tar file available from the Sun Web site. You must be root to run these commands.
The next step is to install the Java Servlet Development Kit (JSDK). This is available from Sun for any platform with a compliant JDK. Separate packages are available for Windows and UNIX (the UNIX version is labeled Solaris, but works on any UNIX or UNIX-like operating system, including Linux). The UNIX package is a tar file that can be unpacked wherever you put local packages. It includes a Java archive (.jar file), which you will reference from your JServ configuration. Note that 2.0 and 2.1 versions of JSDK are available. Current versions of Apache JServ (1.0 and 1.1 beta as of this writing) will only work with JSDK 2.0.
With JDK and JSDK in place, you are ready to install JServ. The installation package is available from the Java Apache Project (http://java.apache.org/). We assume you have installed and tested the Apache Web server. Based on your Apache installation, you must choose whether to compile mod_jserv into the Apache httpd program, or use the Dynamic Shared Object (DSO) support to load mod_jserv when the httpd program is started.
We recommend using DSO on your server. This eases building and upgrading JServ. If you are compiling mod_jserv into the Web server, you will have to run make install in the Apache source directory after making JServ. If the module is loaded dynamically, then the module shared object file (e.g., mod_jserv.so) can be easily moved, copied, or updated in the modules directory without affecting your Web server executable.
The JServ installation will build the necessary objects and (optionally) reconfigure your Web server. This is handled by specifying a number of options (detailed in the INSTALL file included with the JServ distribution), such as the location of your Apache and Java files. For example, here is the configure command we used on our Solaris development server:
configure \
--with-apache-install=/usr/local/src/apache_1.3.4 \
--prefix=/usr/local/jserv \
--enable-apache-conf \
--with-jdk-home=/usr/java \
--with-jsdk=/sunsoft/jsdk
As with other Apache products, the configure script will check the local development environment and build an appropriate Makefile. You will need an ANSI C compiler (such as GCC) to compile mod_jserv. Running make install compiled mod_jserv and copied it into the Apache modules directory, then created the JServ .jar file and installed it into the directory specified by the configure --prefix option.
We could then copy the mod_jserv.so that was built to other Solaris servers, even if they had the Apache Web server installed in a different location or were running a different version of the Web server. We also installed the JServ servlet engine on other Solaris servers by copying the /usr/local/jserv directory.
On RedHat Linux, things can be simplified by downloading the RedHat Package Manager (.rpm) version of JServ. This can be installed using the rpm -i command. By default, this will install JServ into your existing Apache Web ServerRoot. If you want to use JServ with another instance of the Web server, you must move the JServ directories and restore the old Web server configuration.
Configuring JServ
The --enable-apache-conf option to the configure script caused the make command to update the Apache httpd configuration files with the necessary directives to load the mod_jserv module. If you don't use that option, or are copying the JServ installation to another server, you must make these changes to the local Apache configuration yourself. To enable mod_jserv, include LoadModule and AddModule directives (for example, LoadModule jserv_module libexec/mod_jserv.so and AddModule mod_jserv.c, with the path adjusted to your installation) that point to the modules directory where the mod_jserv shared object lives.
The JServ distribution contains an example directory with sample versions of all the configuration files. These samples include a description of each directive or parameter and many examples. This is the best documentation available for configuring JServ. The example jserv.conf file includes Apache JServ directives to include in httpd.conf. It is best to keep these directives in a separate file referenced by httpd.conf using the Apache Include directive. A few of the Apache JServ directives are described here:
ApJServManual -- This directive determines whether JServ is started automatically by the Apache Web server (off) or must be started manually (on).
ApJServProperties -- The file containing configuration properties for the JServ engine, used to start JServ in automatic mode.
ApJServMount -- The mount point and name of a servlet zone.
For example, here are sample directives for a JServ that is started automatically and processes servlet requests for production and test zones:
<IfModule mod_jserv.c>
ApJServManual off
ApJServProperties /usr/local/jserv/conf/jserv.properties
ApJServMount /Z-TEST /TestZone
ApJServMount /Z-PROD /ProdZone
</IfModule>
The ApJServMount directives map URLs to servlets. In our example, the URL http://localhost/Z-PROD/ExampleServlet will execute the servlet class named ExampleServlet in the ProdZone zone. The actual locations of the files or directories containing the classes in that zone are defined by the repositories parameter in the ProdZone properties file. For example, if the ProdZone.properties file contains:
repositories=/usr/local/jserv/servlets/Z-PROD
the URL would run:
/usr/local/jserv/servlets/Z-PROD/ExampleServlet.class
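The mapping just described can be sketched in a few lines of Java. This is only an illustration of the lookup, not JServ's own code: the ZoneResolver class and its methods are hypothetical names, and the sketch assumes a single repository directory per zone.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: mimics how the example URL resolves to a class file.
// ZoneResolver is a hypothetical helper, not part of JServ.
class ZoneResolver {
    // mount point -> zone name (as in the ApJServMount directives)
    private final Map<String, String> mounts = new LinkedHashMap<String, String>();
    // zone name -> repository directory (as in the zone's repositories parameter)
    private final Map<String, String> repositories = new LinkedHashMap<String, String>();

    void mount(String mountPoint, String zone) { mounts.put(mountPoint, zone); }

    void repository(String zone, String dir) { repositories.put(zone, dir); }

    // Returns the class file path for a request path, or null if nothing matches.
    String resolve(String requestPath) {
        for (Map.Entry<String, String> e : mounts.entrySet()) {
            String prefix = e.getKey() + "/";
            if (requestPath.startsWith(prefix)) {
                String servlet = requestPath.substring(prefix.length());
                String dir = repositories.get(e.getValue());
                if (dir == null || servlet.length() == 0) return null;
                return dir + "/" + servlet + ".class";
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ZoneResolver r = new ZoneResolver();
        r.mount("/Z-TEST", "TestZone");
        r.mount("/Z-PROD", "ProdZone");
        r.repository("ProdZone", "/usr/local/jserv/servlets/Z-PROD");
        // prints /usr/local/jserv/servlets/Z-PROD/ExampleServlet.class
        System.out.println(r.resolve("/Z-PROD/ExampleServlet"));
    }
}
```

Walking through a lookup this way is a quick check that a mount point, a zone name, and a repositories entry all line up before you go looking for problems in the servlet itself.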
Specifying separate zones for test and production (and for separate applications) allows you to isolate servlets from each other. For example, if your servlets use the javax.servlet.http.HttpSession interface, user sessions will be unique for each zone. Also, recompiling a class in one zone will not reinitialize other zones. Each servlet zone runs its own class loader, which keeps its servlets from accessing data in other servlet zones. However, note that all zones run in the same JVM with the same user and group IDs and permissions, without a security sandbox of the kind applets have. This means all the servlets have equal access to resources outside of JServ and the JVM. If this is unacceptable in your environment, you can run separate JServs for different zones.
The ApJServProperties file contains properties that define the classpath and environment for the JVM and the properties files for the servlet zones. If the Web server is using the Apache JServ directives in the example above, then the jserv.properties file might look like this:
# wrapper parameters
wrapper.bin=/usr/java/bin/java
wrapper.classpath=/usr/java/lib/classes.zip
wrapper.classpath=/usr/local/jserv/lib/ApacheJServ.jar
wrapper.classpath=/sunsoft/jsdk/lib/jsdk.jar
wrapper.classpath=/sunsoft/jdbc/lib/classes111.zip
wrapper.classpath=/sunsoft/jdbc/mysql.jar
wrapper.env=ORACLE_HOME=/oracle
wrapper.env=ORACLE_SID=ENTDB
wrapper.env=LD_LIBRARY_PATH=/sunsoft/jdbc/lib
# servlet zones
zones=TestZone,ProdZone
TestZone.properties=/usr/local/jserv/conf/TestZone.properties
ProdZone.properties=/usr/local/jserv/conf/ProdZone.properties
The wrapper parameters must point to the JVM (wrapper.bin) and each class library (wrapper.classpath) used by JServ, including JSDK and the JServ engine. Note the JDBC and Oracle wrapper.classpath and wrapper.env parameters. This example is using JDBC drivers to connect servlets to an enterprise Oracle database and a local MySQL database (Figure 2). JDBC is a very powerful tool for developing applications integrating data from disparate sources. A list of JDBC drivers for a wide variety of databases is available at:
http://java.sun.com/products/jdbc/drivers.html
Many are native-protocol fully Java technology-enabled drivers (Type 4 drivers in Sun's classification), meaning that they do not require any middleware or client libraries on the JServ machine to connect to the database server. These drivers are installed the same way the JSDK was installed: simply copy a Java archive file into a known location that is referenced in the wrapper.classpath parameters.
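A missing or mistyped classpath entry usually shows up only when a servlet first tries to load a driver. A small check like the following, run with the same classpath you give JServ, can catch that earlier. It is a sketch: ClasspathCheck is a hypothetical helper, and the default class names are assumptions about what the archives in our example contain (in particular, the Oracle driver class name depends on your driver version), so pass your own names on the command line.

```java
// Reports whether named classes can be loaded from the current classpath.
// ClasspathCheck is a hypothetical helper, not part of JServ or the JDK.
class ClasspathCheck {
    static boolean isLoadable(String className) {
        try {
            // initialize=false: locate the class without running its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            return false; // class found, but something it depends on is missing
        }
    }

    public static void main(String[] args) {
        String[] names = args.length > 0 ? args : new String[] {
            "org.apache.jserv.JServ",          // engine main class from ApacheJServ.jar
            "javax.servlet.http.HttpServlet",  // Servlet API from jsdk.jar
            "oracle.jdbc.driver.OracleDriver"  // assumed driver class in classes111.zip
        };
        for (int i = 0; i < names.length; i++) {
            System.out.println((isLoadable(names[i]) ? "ok      " : "MISSING ") + names[i]);
        }
    }
}
```

Compile it once and run it with the classpath you intend to use for JServ (plus the directory holding ClasspathCheck.class); any line marked MISSING points at a wrapper.classpath entry to recheck.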
The servlet zone properties files define the configuration for each zone. This includes, at a minimum, a repositories parameter like the ProdZone.properties example we looked at when discussing the ApJServMount directive. The repositories parameter can include a list of directories and Java archive files that contain the zone's servlet classes. Other parameters control the zone's class loader, session management, and servlet configuration. For example, the timeout value for an unused user session can be changed from the default of 30 minutes or timeouts can be disabled altogether. These properties are documented in the example/example.properties file.
Running JServ
The wrapper parameters in the JServ properties file allow the Apache Web server to start a JServ automatically. This is often the most convenient way to start JServ, particularly in a production environment where you always want the JServ running. However, when first configuring JServ, it is easier to start and stop JServ manually to test configuration without bouncing your Web server. Also, some advanced features, such as load balancing, require one or more standalone JServ processes that are started manually.
To manually start JServ, turn off the Web server's automatic startup (i.e., set ApJServManual on in the jserv.conf file) and write a little script that defines the CLASSPATH (much like the JServ wrapper properties) and starts up the JVM. Here is a sample startup script, which starts JServ with the same environment as the wrapper parameters above:
#!/bin/sh
# Launch jserv in manual mode.
# Environment variables (same values as wrapper.env above).
LD_LIBRARY_PATH="/sunsoft/jdbc/lib"
export LD_LIBRARY_PATH
ORACLE_HOME=/oracle
ORACLE_SID=ENTDB
export ORACLE_HOME ORACLE_SID
# Class libraries (same entries as wrapper.classpath above).
jdk=/usr/java/lib/classes.zip
jsdk=/sunsoft/jsdk/lib/jsdk.jar
jserv=/usr/local/jserv/lib/ApacheJServ.jar
oradb=/sunsoft/jdbc/lib/classes111.zip
mysql=/sunsoft/jdbc/mysql.jar
props=/usr/local/jserv/conf/jserv.properties
classes=$jdk:$jsdk:$jserv:$oradb:$mysql
# Keep any CLASSPATH already set, without adding an empty entry when it is unset.
if [ -n "$CLASSPATH" ]; then
  classes=$CLASSPATH:$classes
fi
/usr/java/bin/java -classpath "$classes" \
org.apache.jserv.JServ "$props" &
Use the UNIX kill command to stop the JServ Java process.
Administering JServ
Once JServ is running, the systems administrator needs to test and monitor it. JServ includes a built-in servlet to display the current status of both mod_jserv and the servlet engine. To enable the status handler, you must allow JServ to run itself as a servlet by specifying the following in the JServ properties file:
security.selfservlet=true
To access this servlet, include the SetHandler directive in your Web server configuration file to map a Location to the jserv-status servlet. For example, this configuration stanza allows you to use a URL like:
http://www.your.org/status/jserv/
to view the status pages:
<Location /status/jserv/>
SetHandler jserv-status
order deny,allow
deny from all
allow from your.org
</Location>
It is important to restrict the Location to trusted sites, since the status pages may give users the ability to gather important system information. This configuration allows you to immediately test whether the basic JServ service is working. The status pages list the various pieces that have been installed and the parameters that have been set, so they can be used to verify your configuration.
JServ also includes two logging facilities, one for mod_jserv and another for the servlet engine. Several changes in the configuration options were made in JServ 1.1, so review the example files for your version to get the details.
The mod_jserv log file is defined in the Apache Web server configuration with the ApJServLogFile directive:
ApJServLogFile /usr/local/apache/logs/mod_jserv.log
mod_jserv logs a moderate amount of information about startup and shutdown, problems connecting to the servlet engine, and servlet errors that are returned by the servlet engine. Alternatively, you can mark the log file as DISABLED and mod_jserv will redirect its messages to the Apache error log file. In version 1.1, you have additional control over the level of mod_jserv logging with the ApJServLogLevel directive.
The logging and tracing options for the servlet engine are specified in the JServ properties file, and examples and documentation are found in the example/jserv.properties file. The servlet engine log file can be made quite verbose by using a variety of logging and tracing options. When first configuring JServ, it is useful to turn on all of these options, but on a production server, this can quickly result in a very large log with every action traced. Logging is also an expensive operation in terms of performance and should be disabled (or limited to exceptions) on any production server where performance is an issue.
Logging is not the only feature that can be tuned for performance. Just as you must balance the need for status and load information with the cost of logging, tuning for performance often requires compromises in other areas. Here are some trade-offs to consider; a more detailed discussion can be found in a performance paper available from the Java Apache Project at:
http://java.apache.org/jserv/papers/performance.pdf
Multithreading
By default, JServ will store only one instance of each servlet and use multiple threads to handle concurrent requests. While this improves performance (even on single-processor servers), care must be taken in servlet design to avoid multiple threads corrupting shared resources. If servlets cannot be designed this way, they must implement the SingleThreadModel interface to force JServ to store multiple instances to handle multiple requests, losing the performance benefit of multithreading. Also, if supported on your operating system, always use a JVM with native thread support rather than green threads (virtual threads on a single-threaded OS).
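The hazard is easiest to see in a stripped-down form. In this sketch, HitCounter stands in for a servlet with one shared field; it does not use the Servlet API, and the class and method names are hypothetical. The synchronized methods are what keep concurrent updates from being lost.

```java
// One shared object, many "request" threads: the situation a servlet is in
// when JServ uses a single instance. HitCounter is a stand-in, not a servlet.
class HitCounter {
    private int hits = 0;

    // synchronized makes the read-modify-write of hits atomic across threads
    synchronized void recordHit() { hits++; }

    synchronized int getHits() { return hits; }

    // Runs `threads` concurrent workers, each recording `perThread` hits,
    // and returns the final count.
    static int simulate(int threads, final int perThread) {
        final HitCounter counter = new HitCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < perThread; j++) counter.recordHit();
                }
            });
            workers[i].start();
        }
        for (int i = 0; i < threads; i++) {
            try {
                workers[i].join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.getHits();
    }

    public static void main(String[] args) {
        System.out.println(simulate(8, 10000)); // prints 80000
    }
}
```

Without the synchronized keyword, some increments can be lost when threads interleave, and the total may come up short. Synchronizing only the code that touches shared data keeps most of the benefit of multithreading, which is usually preferable to falling back on SingleThreadModel.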
Authentication
JServ provides an authentication mechanism to verify that only trusted Web servers and servlet engines are connected. This mechanism requires a challenge and response procedure that takes an additional round-trip over the network between the two services. This delay can be eliminated by turning off the authentication mechanism and relying on connection IP filtering and other external network protections. (Note that network delay between Web servers and servlet engines can be eliminated altogether if they are placed on the same server.)
Class Loading and Caching
Servlets can be pre-loaded at JServ startup to avoid any latency that would occur on the first request to uninitialized servlets. This is configurable in the zone properties file. Also, if a servlet class is changed after it has been loaded, JServ will detect the change and reload it. For consistency, JServ must reinitialize the entire zone, including reloading all other classes in that zone. Auto-reloading can severely degrade performance while the zone is being reinitialized. Separating applications and test environments in different zones can minimize this effect.
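To make the cost of auto-reloading concrete, here is a sketch of one way change detection can work: remember each class file's modification time at load, and compare it on later requests. That JServ's class loader works exactly this way is an assumption on our part; ReloadTracker and its methods are hypothetical names.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

// Sketch of timestamp-based change detection for loaded class files.
// ReloadTracker is hypothetical; JServ's own class loader may differ.
class ReloadTracker {
    // class file path -> modification time seen when the class was loaded
    private final Map<String, Long> loadedAt = new HashMap<String, Long>();

    void recordLoad(String classFile, long lastModified) {
        loadedAt.put(classFile, Long.valueOf(lastModified));
    }

    // True when a previously loaded file now carries a different timestamp.
    boolean isStale(String classFile, long lastModifiedNow) {
        Long seen = loadedAt.get(classFile);
        return seen != null && seen.longValue() != lastModifiedNow;
    }

    // Convenience forms that read the timestamp from the file system.
    void recordLoad(File f) { recordLoad(f.getPath(), f.lastModified()); }

    boolean isStale(File f) { return isStale(f.getPath(), f.lastModified()); }

    public static void main(String[] args) {
        ReloadTracker zone = new ReloadTracker();
        zone.recordLoad("ExampleServlet.class", 1000L);
        System.out.println(zone.isStale("ExampleServlet.class", 1000L)); // prints false
        System.out.println(zone.isStale("ExampleServlet.class", 2000L)); // prints true
    }
}
```

A check like this has to consult the file system for every tracked file, and a single stale answer means the whole zone is reinitialized, which is why busy production zones are better kept free of files that change.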
If, after performance tuning, your server is simply unable to handle the load, you can run multiple JServs and let mod_jserv balance the load across the servlet engines. The JServs can run on the same server or be distributed across multiple servers. Running multiple JServs on a single server can still help by distributing the threads across multiple JVMs. To run multiple instances of JServ on the same machine, the bindaddress and port parameters in the JServ properties file must be unique for each instance.
The features of load balancing can also be used to add some fault tolerance to your JServ implementation, since mod_jserv will redirect requests to other servers if one JServ fails. Also, some of these features can be used to set up complex networked environments where certain zones are hosted on distributed hosts. These topics are explored in a how-to document available from the Java Apache Project:
http://java.apache.org/jserv/howto.load-balancing.html
Load balancing is enabled in the Apache configuration for mod_jserv. Suppose you have two instances of JServ on two different hosts. The following directives will balance requests for the Z-PROD zone between the two servers:
# use manual startup for load-balancing configurations
ApJServManual on
# Set the mount point and load balancing on a zone: ProdZone
ApJServMount /Z-PROD balance://set1/ProdZone
# set each jserv weight, default 1
ApJServBalance set1 SUNSVR
ApJServBalance set1 LNXSVR
# specify jserv protocol and connection host and port
ApJServHost SUNSVR ajpv11://jserv1.your.org:8007
ApJServHost LNXSVR ajpv12://jserv2.your.org:8007
# define a unique session cookie suffix for each jserv
ApJServRoute JS1 SUNSVR
ApJServRoute JS2 LNXSVR
Balancing is done by randomly selecting a server from the set of hosts for each request. A weight can be added to have some hosts process more requests than others. For example, if we want LNXSVR to handle two thirds of the requests, we can add a weight of two to its balance:
ApJServBalance set1 LNXSVR 2
Since the requests are routed in a random manner (rather than in a round robin fashion) the actual load may not be exactly balanced. However, the larger the number of requests, the more balanced the load will be.
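The effect of weights and sample size can be tried out with a short simulation. This is a sketch of weighted random selection in general, not mod_jserv's code (which is written in C); WeightedPicker is a hypothetical name, and the host names and weights are the ones from our example.

```java
import java.util.Random;

// Weighted random selection: a host with weight 2 is picked about twice as
// often as a host with weight 1. Illustrative only, not mod_jserv's code.
class WeightedPicker {
    private final String[] hosts;
    private final int[] weights;
    private final int total;
    private final Random random;

    WeightedPicker(String[] hosts, int[] weights, Random random) {
        if (hosts.length == 0 || hosts.length != weights.length) {
            throw new IllegalArgumentException("need one weight per host");
        }
        int sum = 0;
        for (int i = 0; i < weights.length; i++) {
            if (weights[i] <= 0) throw new IllegalArgumentException("weights must be positive");
            sum += weights[i];
        }
        this.hosts = hosts;
        this.weights = weights;
        this.total = sum;
        this.random = random;
    }

    String pick() {
        int r = random.nextInt(total); // 0 <= r < total
        for (int i = 0; i < hosts.length; i++) {
            r -= weights[i];
            if (r < 0) return hosts[i];
        }
        throw new IllegalStateException("weights changed after construction");
    }

    // Fraction of `requests` random picks that land on `host`.
    static double share(String host, int requests, long seed) {
        WeightedPicker p = new WeightedPicker(
            new String[] { "SUNSVR", "LNXSVR" }, new int[] { 1, 2 }, new Random(seed));
        int hits = 0;
        for (int i = 0; i < requests; i++) {
            if (p.pick().equals(host)) hits++;
        }
        return (double) hits / requests;
    }

    public static void main(String[] args) {
        System.out.println("30 requests:     " + share("LNXSVR", 30, 1L));
        System.out.println("300000 requests: " + share("LNXSVR", 300000, 1L));
    }
}
```

With only a few dozen requests the observed share can stray noticeably from two thirds; with hundreds of thousands it settles very close to it.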
In the example above, the two hosts are actually running different versions of JServ. SUNSVR is running version 1.0, and LNXSVR is running 1.1b3. These two versions of JServ use different versions of the AJP. Therefore, we must specify the protocol for mod_jserv to use when talking to the two hosts. This is done in the ApJServHost directive, where we also specify hostname and port number as a URL. Note that mod_jserv must be able to use the latest version of AJP; you cannot force a 1.1 JServ to use the older protocol.
Our example also demonstrates how to force a session to be bound to a particular JServ. Since user sessions contain state information used by servlets, the same JServ must process all requests for that session. Sessions are identified by session cookies set by servlets. The ApJServRoute directive tells mod_jserv to look for a JServ id at the end of the session cookie to determine which JServ owns the session. If there is no JServ id, mod_jserv will select the JServ based on its usual (random+weight) mechanism. It will pass the selected server's id to the JServ, which will append it to the cookie when it is passed back to the Web browser. Since the JServ id is in the cookie, it will be available for the next request even if it is processed by a different Web server (Figure 3).
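The routing decision itself is simple, as this sketch shows. It assumes the JServ id is appended to the session cookie value after a period; that separator is our assumption for illustration, and SessionRouter is a hypothetical name rather than anything in mod_jserv.

```java
import java.util.HashMap;
import java.util.Map;

// Finds the JServ that owns a session from the id at the end of the cookie.
// Assumes a "." separator before the id; illustrative only.
class SessionRouter {
    // JServ id -> host name (as in the ApJServRoute directives)
    private final Map<String, String> routes = new HashMap<String, String>();

    void route(String id, String host) { routes.put(id, host); }

    // Returns the owning host, or null when the cookie carries no known id
    // (the caller then falls back to the usual random selection).
    String ownerOf(String sessionCookie) {
        int dot = sessionCookie.lastIndexOf('.');
        if (dot < 0) return null;
        return routes.get(sessionCookie.substring(dot + 1));
    }

    public static void main(String[] args) {
        SessionRouter router = new SessionRouter();
        router.route("JS1", "SUNSVR");
        router.route("JS2", "LNXSVR");
        System.out.println(router.ownerOf("a1b2c3d4.JS2")); // prints LNXSVR
        System.out.println(router.ownerOf("a1b2c3d4"));     // prints null
    }
}
```

Because the owner is read from the cookie rather than from any state held by the Web server, every Web server configured with the same routes reaches the same answer.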
JServ includes both authentication and IP filtering to ensure that only trusted mod_jservs and servlet engines can connect. These features are particularly important when you have a distributed JServ implementation as in our example. IP filtering is done by the servlet engine when accepting a connection from a Web server. The security.allowedAddresses property must include all IP addresses for any Web server allowed to connect to it.
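As a rough picture of what the address check amounts to, the sketch below tests a connecting address against a list of allowed addresses. It assumes the property value is a comma-separated list of literal IP addresses; AddressFilter is a hypothetical name, and the addresses shown are placeholders.

```java
import java.util.HashSet;
import java.util.Set;

// Accepts a connection only when its address appears in the allowed list.
// Illustrative only; assumes a comma-separated list of literal addresses.
class AddressFilter {
    private final Set<String> allowed = new HashSet<String>();

    AddressFilter(String allowedAddresses) {
        String[] parts = allowedAddresses.split(",");
        for (int i = 0; i < parts.length; i++) {
            String address = parts[i].trim();
            if (address.length() > 0) allowed.add(address);
        }
    }

    boolean accepts(String remoteAddress) {
        return allowed.contains(remoteAddress);
    }

    public static void main(String[] args) {
        // placeholder addresses standing in for the Web servers' real ones
        AddressFilter filter = new AddressFilter("127.0.0.1, 192.0.2.10");
        System.out.println(filter.accepts("192.0.2.10")); // prints true
        System.out.println(filter.accepts("192.0.2.99")); // prints false
    }
}
```

A check of this kind sees only the source address of the connection, which is one reason to combine it with authentication when Web servers and JServs sit on different hosts.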
Authentication is also enabled in the JServ properties file. The following properties turn on authentication and specify the location of a key that is used by the Web server and servlet engine to authenticate:
security.authentication=true
security.secretKey=/usr/local/jserv/conf/jserv.secret.key
The secret key is simply an arbitrary text file. A key file on the Web server, with the exact same contents as the JServs' key files, must be specified in its configuration file:
ApJServSecretKey /usr/local/apache/conf/jserv.secret.key
When a request is passed to the JServ, it will send a challenge to the Web server. The Web server then computes an MD5 digest from the challenge and the key and passes the digest to the JServ, so the key itself is not sent over the network. To ensure security of the key on the servers, the Web server host's jserv.secret.key file should only be readable by the Apache httpd process, and the servlet engine's file should only be readable by the JServ Java process.
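The challenge and response step can be sketched with java.security.MessageDigest. The sketch assumes the response is an MD5 digest computed over the challenge followed by the shared key; the exact byte layout used on the wire by AJP is not covered here, and ChallengeResponse is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

// Shared-secret challenge and response using an MD5 digest.
// Illustrative only; the byte layout is an assumption, not the AJP format.
class ChallengeResponse {
    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    // What the Web server side computes: digest of challenge followed by key.
    static byte[] respond(byte[] challenge, byte[] secret) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            md5.update(challenge);
            md5.update(secret);
            return md5.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // What the servlet engine side checks, using its own copy of the key.
    static boolean verify(byte[] challenge, byte[] secret, byte[] response) {
        return MessageDigest.isEqual(respond(challenge, secret), response);
    }

    public static void main(String[] args) {
        byte[] challenge = new byte[8];
        new SecureRandom().nextBytes(challenge); // fresh challenge per connection
        byte[] key = ascii("contents of jserv.secret.key");
        byte[] answer = respond(challenge, key);
        System.out.println(verify(challenge, key, answer));                // prints true
        System.out.println(verify(challenge, ascii("wrong key"), answer)); // prints false
    }
}
```

Both sides must hold byte-for-byte identical key files for the digests to match, so a stray newline added by an editor on one host is enough to make authentication fail.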
Future Developments
As this article is being written, it appears that JServ will be rolled into the new Apache Jakarta Project:
http://jakarta.apache.org/
JServ was developed as an independent implementation of the JavaSoft servlet specification. The Jakarta Project is merging the work that Sun did on the original reference implementation of the spec with the work being done on JServ. The result will be the official reference implementation for JSDK 2.1 (and subsequent versions) and, although the project is supported by the Apache Software Foundation under its open source rules, the servlet engine will not be limited to Apache Web servers.
However, the Apache JServ Future Roadmap:
http://java.apache.org/jserv/future/
indicates that development of JServ will continue, at least for one more major release. Important improvements planned for that release include socket and thread recycling, AJP version 2.1, and support for JSDK 2.1. There may also be improved status pages with current load and dynamic status information.
About the Authors
Don Gourley and Wei Wu work at the Washington Research Library Consortium where they develop and administer Web-based information systems for academic libraries. Don can be reached at: gourley@wrlc.org. Wei can be reached at: weiwu@wrlc.org.