Cover V10, I13
Article
Figure 1
Figure 2

jun_sup2001.tar


An Introduction to IBM's Network Dispatcher

Mike Wilcox

The Internet has brought us many new technologies as well as new requirements for the Web-based application. One urgent need is for high availability and rapid scalabilty of the infrastructure supporting our Web-based application. To meet this need, I am going to discuss "client spraying". In a basic setup, an HTTP client will connect to one HTTP server. However, if this server is down or is busy, there is not a backup server for the client to connect to. Although there are other ways to correct this problem, the method used by IBM's Network Dispatcher is called "client spraying".

The idea behind client spraying is to create a pool of HTTP servers and allow a client to connect to any one of the available servers. Because the Network Dispatcher only allows connections to available servers, high availability can be achieved. By adding more servers to the pool and then using advisors (which I will address later), the Network Dispatcher can load balance among the available servers. The ease of adding servers combined with the load balancing provides a powerful architecture to scale our application's server resources.

My goal in this article is to introduce the technology and then, via an example, elaborate on how the Network Dispatcher works. I will start with a discussion of how load balancing and high availability of the servers is achieved using Network Dispatcher's advisors and metrics. I will then discuss how high availability is built into the Network Dispatcher, so that the Network Dispatcher is not a single point of failure.

I'll begin with a simple example of one machine running Network Dispatcher (ND1) and two servers (HTTP1 and HTTP2) running our application. See Figure 1. Our Network Dispatcher cluster address is 9.9.10.1 (www.mike.com), and we are spraying port 80. From the client's perspective, the client makes a request to a specific IP address and port and gets a response for a server, in this example HTTP2. There are two critical concepts here: one is that the client does not know that Network Dispatcher is involved, and the second is that the response from the server (in our example, HTTP 2) is sent directly back to the client and bypasses ND1. The second point is subtle, but important. From a performance standpoint, we do not want all the traffic to have to go from the server back through ND1; and if we consider the typical HTTP-based application flow, the client will send a small amount of data to ND1 (e.g., a URL and page request). The HTTP server can send large amounts of data back to the client (e.g., to send an HTML page with many pictures).

Network Dispatcher Advisors

The next time the client makes a request, it may get connected either to HTTP1 or HTTP2, depending upon what ND1 chooses. To determine the availability of a server, ND1 will be running what are called Network Dispatcher Advisors. These advisors are Java programs that run on ND1. Advisors that come with the product include: ping, socket, SSL, telnet, ftp, NNTP, SMTP, POP3, IMAP, and HTTP). In this example, I will use an HTTP advisor on ND1, which sends an HTTP "Get request" to each server, and then time how long each server takes to get a response back to ND1. The server with the shortest response time is marked the best, while the next shortest response time is marked second, and so on until all the servers have been ordered. In this example, only two servers are involved, so either HTTP1 or HTTP2 will be marked as the "best."

If a server does not respond in specific period of time (e.g., if the server is down), then the HTTP advisor marks the server down and ND1 will not dispatch any more clients to that node. If the advisor code determines that the HTTP Get response was unsatisfactory (e.g., if "Error 404" occurred in response to the HTTP Get), then the HTTP advisor could mark the server down as well. Another key feature of Network Dispatcher is that it handles server failure by routing around the failed server, and we have a very powerful tool to use to decide when a server is actually down. The default HTTP Advisor in ND will only mark a server down if the server response times out, but you can easily create your own advisor. The appendix of the Network Dispatcher User's Guide contains the source code for an HTTP Advisor, and you can easily modify the advisor to look for the actual response from the HTTP Get.

This approach to determining whether a server is up or down has been described as the "client's eye view of the application", because we're looking for a response from the server that indicates that the application is available for useful work from the client's perspective. The client may have to traverse many layers to get to usable data. For example, the client may have to connect to the HTTP server first and then on to the application server, which will then make a call to a database. If any of these layers has a problem on the server being tested, then the advisor should mark the server as down. We then have the ability to dispatch around a "bad" server with either the supplied advisor or our own custom "client's eye" advisor.

Which Is the Best?

I will now discuss how Network Dispatcher selects the "best" server for the client to connect to. If we can select the best server on a regular basis, we can then "load balance" across our servers. There are actually four components, I will call them metrics, used to determine the "best" server. These metrics are:

1. The number of active connections to the server

2. The number of connections sent to the server recently

3. The advisor's response

4. The Interactive Session Support (ISS) agent input

These four metrics can be weighted in any fashion. For example, 30%, 30%, 30%, and 10 % would indicate that ISS input is not as heavily weighted as the other three metrics. The first metric is simply a count of the connections, while the second metric is a count of the number of connections during the last window. The window is a value that can be changed, but changing the window length is generally not required. The third metric was discussed above, while the fourth metric is a new topic. ISS agents provide statistics based on system load as derived from CPU load or free memory, or you can write a custom value. Typically, the best weighting of the metrics is 30% or greater for active connections and 30% for recent connections, with the remaining 40% falling on ISS and advisor input.

Network Dispatcher is composed of two main programs: the executor and the ndserver. The executor is responsible for reading the TCP header from the client, doing a lookup in a table to find the correct server to spray to, and then dispatching the client packet to a server. Because every client packet is sent to the executor, the executor is a key part in determining the Network Dispatcher's performance. The "in memory connection" table in combination with the executor running in kernel space allows the Network Dispatcher to be very efficient in "spraying" a client request. With the ndserver handling all the other activities (particularly the important task of updating the "in-memory connection" table), the executor is free to handle only the few tasks mentioned above.

The last issue involves what happens with Network Dispatcher software crashes. The answer lies in using Network Dispatcher's High Availability option, which comes free with the product. With the HA option, you run Network Dispatcher code on two separate machines. One machine acts as the primary (ND1-primary) and the other machine acts as a backup (ND1-backup). See Figure 2. The cluster status is communicated between the machines via heartbeats. Then, when the primary machine dies, the backup Network Dispatcher takes over.

One of the keys to the success of the failover is the passing of the "in-memory" connection table from the primary Network Dispatcher to the backup Network Dispatcher. The table is actually passed via the heartbeats. With the "in-memory" connection information on the backup node, items like the stickiness setting are maintained after failover. Mutual high availability is supported with the latest version of Network Dispatcher (version 3). Mutual high availability allows you to have one ND as the primary ND for one cluster address and as the backup ND for a second cluster address. Therefore, you utilize both machines for dispatching. By sending out ping-like requests to a third machine, typically a high-profile router, the network dispatcher can also detect a network card failure. Thus, in the event that a network adapter fails on the primary (ND1-primary), the "reach machine" will not be accessible, and a failover to the backup network dispatcher (ND1-backup) will occur.

Network Dispatcher also supports "co-location", which allows you to run your application on the same machine as the Network Dispatcher. Thus, in the previous example, we could run the Network Dispatcher (ND1) on the server node HTTP1. If we run the backup Network Dispatcher on HTTP2, we have reduced the setup to only two nodes. However, we have a highly available application, because any one node could crash and we would survive. We also have load balanced our client connections and, if we need more servers, we simply add them to our Network Dispatcher cluster.

Conclusion

I have provided a brief introduction into the features of the Network Dispatcher. In addition to the key points of high availability, load balancing, and scaling, remember that the Network Dispatcher will dispatch any port, not just port 80 (the example port). With this in mind, many applications can use the Network Dispatcher. For example, Oracle's financial application (Oracle 10 or 11), routers, firewalls, and so forth. Some barriers may come to mind as you think through the potential uses for Network Dispatcher, but I have only discussed a few of the features here. In the future, I may address additional features such as content-based routing, wide-area dispatching, dynamic DNS, IP-rules, and Web traffic express. Network Dispatcher can run on Windows NT and 2000, AIX, Solaris, and Linux. Good luck, and start the sprayers.

Mike Wilcox has worked with AIX systems and TCP networking for the past 11 years. The majority of this time has been as a consultant to fortune 500 companies, building the infrastructure for applications ranging from ERP applications to "Web browser-enabled applications". Mike is a senior member of IEEE and has been certified in HACMP, SP, TSM on AIX (a.k.a. ADSM), and multiple AIX certifications. His most recent project was to build a highly available e-business infrastructure composed of multiple backend platforms using WebSphere Application Server, WebSphere Performance Pack, IBM's content manager, MQ, DB2, and AIX. Mike can be contacted at: mwilcox1@ieee.org.