Cover V07, I01

SOCKSLIB.PL - A Library for Using SOCKS Firewalls

Matt Ganis

If you're a system administrator, you've probably been asked to "help support just a few more machines." That's usually not too bad; most of your scripts and tools can easily be expanded to support "a few more." What people often don't realize is that servers on the Internet (outside your corporate firewall) must be managed differently from those on the inside. Firewalls are put in place to make it difficult to break into your network, but they can also make it difficult for those on the inside to get out. SOCKSLIB.PL is a Perl library that lets a Perl programmer issue just one call, rconnect(), to connect to a networked machine, whether that machine is inside or outside the firewall.

SOCKS has been around since about 1994. It was originally conceived by David Koblas while at MIPS Computer Systems and has gained wide acceptance within the Internet community. SOCKSLIB currently supports version 4 of the SOCKS protocol. Although there is currently an effort to standardize the protocol (version 5) within the IETF RFC process, most SOCKS firewalls are running version 4.

Making a Network Connection

Anyone who has written a networked application in Perl (or C, for that matter) knows there are several steps involved in setting up the network connection (in this case a TCP session from one machine to another). Specifically they are:

  1. Convert the hostname to an IP address.
  2. Obtain the system protocol number (for tcp, udp, etc.).
  3. Create a socket of the desired type. (In SOCKSLIB's case, I always create a TCP-based socket within the AF_INET domain, that is, the Internet TCP/IP domain.)
  4. Build a data structure that contains the address family, port, and address of the machine with which you wish to talk.
  5. Issue the connect (over the newly created socket) using the data structure to indicate which machine to talk to and how to talk.
This may sound confusing (and it can be if you're not sure what you're doing). Since most applications use the reliable tcp protocol (as opposed to the unreliable udp protocol), I've written the SOCKSLIB library under the assumption that all connections are tcp-based.
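
The five steps above can be sketched in Perl as follows. This is a minimal illustration that assumes the core Socket module; tcp_connect() is a name made up for this example and is not taken from SOCKSLIB, and the fallback to the IPPROTO_TCP constant is an added assumption for systems without a protocols database.

```perl
use strict;
use warnings;
use Socket qw(inet_aton pack_sockaddr_in PF_INET SOCK_STREAM IPPROTO_TCP);

# Open a direct tcp connection. Returns a socket handle, or nothing on failure.
# tcp_connect() is a hypothetical name, not part of SOCKSLIB.
sub tcp_connect {
    my ($host, $port) = @_;

    # 1. Convert the hostname (or dotted-quad string) to a packed IP address.
    my $packed_ip = inet_aton($host) or return;

    # 2. Obtain the system protocol number for tcp (fall back to the
    #    Socket constant if the protocols database is unavailable).
    my $proto = getprotobyname('tcp') // IPPROTO_TCP;

    # 3. Create a tcp socket in the Internet domain.
    socket(my $sock, PF_INET, SOCK_STREAM, $proto) or return;

    # 4. Build the structure holding address family, port, and address.
    my $dest = pack_sockaddr_in($port, $packed_ip);

    # 5. Issue the connect over the newly created socket.
    unless (connect($sock, $dest)) {
        close $sock;
        return;
    }
    return $sock;
}
```

A call such as tcp_connect('www.example.com', 80) would hand back a handle ready for reading and writing, or nothing if any step failed.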

Making a tcp connection through a SOCKS firewall is slightly more complicated. Before it can talk to the foreign host, the client connects via tcp to the SOCKS firewall. Using the protocol defined in the SOCKS 4 specification, the client gives the firewall the IP address and port number of the remote host with which it wishes to communicate. The SOCKS firewall makes the connection (on behalf of the client) and then acts as a relay between the two (receiving data from the client and passing it on to the foreign host, and vice versa). As part of the protocol, a user name is also passed along in the session setup (for tracking purposes on the server). For this library, I've hardcoded the user name to be Perl. It can be changed in the rconnect() routine at the statement $bytes3 = pack("C5",80,101,114,108,00). If you do the decimal-to-ASCII conversion, you'll find that's P-e-r-l with a 0 terminator.
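
The request described above can be sketched as follows. The byte layout (version 4, command 1 for CONNECT, a two-byte port in network order, the four address octets, then the NUL-terminated user name) follows the SOCKS 4 protocol description; the function names and the sample address 192.0.2.10 are assumptions made for illustration and are not SOCKSLIB's own code.

```perl
use strict;
use warnings;

# Build a SOCKS 4 CONNECT request. Hypothetical helper, not part of SOCKSLIB.
sub socks4_connect_request {
    my ($port, $octets, $user) = @_;
    return pack('C C n C4', 4, 1, $port, @$octets)   # version, command, port, IP
         . $user . "\0";                             # user name plus 0 terminator
}

# Extract the result code (for example 90 or 91) from the firewall's reply.
# The code is carried in the second byte of the reply.
sub socks4_reply_code {
    my ($reply) = @_;
    return unless defined $reply && length($reply) >= 2;
    my (undef, $code) = unpack('C C', $reply);
    return $code;
}

# Sample request: port 80 on 192.0.2.10, user name "Perl".
my $request = socks4_connect_request(80, [192, 0, 2, 10], 'Perl');
printf "request is %d bytes long\n", length $request;
```

The user-name portion produced here is byte-for-byte what the pack("C5",...) statement in the text generates.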

I wanted to avoid having to know ahead of time whether a given host was outside the firewall (requiring the use of SOCKS) or inside it, so I combined all of the network calls (and the determination of the status of a given host) within the call to rconnect(). You just issue rconnect() to a given host, and the socket filehandle, SOCKS_SOCKET, is set up to talk to the remote host. Whether the connection goes through a firewall is immaterial to the programmer.

An Example

Listing 1, webcheck.pl, is a small program used to check some of the Web servers that I use (and support) to ensure they are working correctly. I needed a program that connected to a Web server and retrieved a predetermined page of text. If the amount of text returned matches the expected amount, I assume the Web server is working fine. Note that you could simply connect to port 80 (the default Web port) and issue a GET command, but if you don't know how much data should be returned, you can't tell whether you're getting back a 404 (and its associated error text) or a real page.
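
The size comparison described above might look something like the sketch below. It is not the code from Listing 1; count_bytes() and page_looks_ok() are hypothetical names, and an in-memory filehandle stands in for the socket so the example runs without a network.

```perl
use strict;
use warnings;

# Read everything from a handle and return the number of bytes received.
sub count_bytes {
    my ($fh) = @_;
    my $total = 0;
    while (my $n = read($fh, my $buf, 4096)) {
        $total += $n;
    }
    return $total;
}

# True when the reply is exactly as long as the known-good page.
sub page_looks_ok {
    my ($fh, $expected) = @_;
    return count_bytes($fh) == $expected;
}

# Stand-in for a server reply; a real check would read from the socket.
my $canned = "HTTP/1.0 200 OK\r\n\r\nhello\n";
open(my $fh, '<', \$canned) or die "cannot open in-memory handle: $!";
print page_looks_ok($fh, length $canned) ? "ok\n" : "size mismatch\n";
```

A short error page or a truncated reply gives a different byte count, which is what makes the comparison useful.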

Setup

Within the webcheck.pl program, the servers to check are added to the array webservers[]. No distinction is made between servers inside the firewall and those outside it on the Internet. This makes managing servers relatively transparent to the system administrator, the support staff, or any other users to whom you supply this library.

Also, besides the use of the rconnect() call, you should include a call to the SOCKSinit() routine to set up the global variables needed for the routines. However, it's not mandatory to issue the call. A check is made in the rconnect() routine, and if the global variable SOCKS_GW isn't set, the SOCKSinit() routine gets called anyway.
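
The "initialize on first use" check described above can be illustrated with a small sketch. The routine names init_settings() and ensure_settings() are stand-ins invented for this example (they are not the library's SOCKSinit() or rconnect()), and the gateway name and network are placeholders.

```perl
use strict;
use warnings;

our ($SOCKS_GW, @SOCKS_DIRECTS);

# Stand-in for an init routine: site-specific settings live in one place.
sub init_settings {
    $SOCKS_GW      = 'socks.example.com';   # placeholder gateway name
    @SOCKS_DIRECTS = ('10.0.0.0');          # placeholder local network
    return;
}

# Stand-in for the check made before connecting: run the init routine
# only if the gateway variable has not been set yet.
sub ensure_settings {
    init_settings() unless defined $SOCKS_GW;
    return $SOCKS_GW;
}
```

With this pattern, a caller that forgets the explicit initialization still gets working defaults, while a caller that sets the variables first keeps its own values.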

The SOCKSLIB library is intended to be installed into a public library. (I install public libraries in /usr/local/lib.) Once you install the library you'll need to make some modifications to the library to fit your site. All modifications need to be made in the SOCKSinit() routine where all of the global variables are defined.

The first of these is the SOCKS gateway that you want your customers to use. Simply set the global variable $SOCKS_GW to either the name or the IP address of your firewall. The getIPaddress() routine is smart enough to know the difference between an address and a fully qualified name.
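
One way to tell an address from a name is sketched below. This is an assumption about how such a routine could work, not the library's actual getIPaddress(); resolve_octets() is a hypothetical name, and the core Socket module is assumed for the hostname lookup.

```perl
use strict;
use warnings;
use Socket qw(inet_aton);

# Return the four octets for a dotted-quad string or a hostname.
# Returns an empty list on failure. Hypothetical helper, not part of SOCKSLIB.
sub resolve_octets {
    my ($address) = @_;

    # Dotted-quad input: no name lookup is needed.
    if ($address =~ /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/) {
        my @octets = ($1, $2, $3, $4);
        return if grep { $_ > 255 } @octets;   # reject out-of-range octets
        return map { int $_ } @octets;
    }

    # Anything else is treated as a hostname and handed to the resolver.
    my $packed = inet_aton($address) or return;
    return unpack('C4', $packed);
}

my @octets = resolve_octets('15.23.192.15');
print join('.', @octets), "\n";
```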

The second set of "variables" is an array of networks, SOCKS_DIRECTS, that are local subnets within your site. In the rconnect() routine, a call is made to the routine CheckDirect(). This routine compares the IP address of the destination to which you want to connect with the "addresses" in the SOCKS_DIRECTS array. If a match is found, the tcp connection is made directly to the destination, bypassing the firewall. The determination of a direct host is made by comparing the IP addresses octet by octet, from the most significant byte to the least, where a zero in an entry matches any value. So, comparing a destination of 15.23.192.15 with an entry of 15.0.0.0, 15.23.0.0, or just 15 would cause a direct connection to be made.
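
The matching rule can be sketched as follows. This is one reading of the description above, not the library's CheckDirect() itself: entry_matches() and is_direct() are hypothetical names, a short entry such as 15 is padded with zeros, and every zero octet in an entry is assumed to act as a wildcard.

```perl
use strict;
use warnings;

# True when $dest (dotted quad) falls inside the network named by $entry.
sub entry_matches {
    my ($dest, $entry) = @_;
    my @want = split /\./, $entry;
    my @have = split /\./, $dest;
    push @want, 0 while @want < 4;          # "15" is treated like "15.0.0.0"

    for my $i (0 .. 3) {                    # most significant octet first
        next if $want[$i] == 0;             # a zero matches any value
        return 0 if $want[$i] != $have[$i];
    }
    return 1;
}

# Returns 1 when any entry in the list matches the destination, else 0.
sub is_direct {
    my ($dest, @directs) = @_;
    my $hits = grep { entry_matches($dest, $_) } @directs;
    return $hits ? 1 : 0;
}

for my $entry ('15.0.0.0', '15.23.0.0', '15', '16.0.0.0') {
    printf "%-10s %s\n", $entry,
        entry_matches('15.23.192.15', $entry) ? 'direct' : 'via firewall';
}
```

Note that under this reading an entry of 0.0.0.0 would match every destination, so the list should contain only networks that really are local.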

Summary

The routines in the SOCKSLIB.PL package, though simple, can be quite useful (even if you're not using the routines to connect through a firewall). In summary, the relevant routines are:

rconnect( address, port ) - where address is either a hostname or an IP address and port is a valid port number to connect to. Returns with the global variable SOCKS_SOCKET set for communication. Returns:

-1:	could not create a socket to the firewall
-2:	could not connect to the firewall
-10:	could not create a socket to direct host
-20:	could not connect to a direct host
90:	successful SOCKS transfer
91:	SOCKS server could not connect to a host

getIPaddress( address ) - returns an array of four elements representing four octets of an IP address. Note that address can be either a hostname or a valid TCP/IP address.

CheckDirect( address ) - checks if a given address is an internal or external address.

Returns:

0:	Network is defined as internal
1:	Network is defined as external
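
The rconnect() return codes listed above lend themselves to a small lookup table for error reporting. The sketch below uses only the codes given in this article; describe_rconnect() is a hypothetical helper, not part of the library, and any value not in the list is simply reported as unrecognized.

```perl
use strict;
use warnings;

# Return codes as listed in the summary above.
my %RCONNECT_STATUS = (
    -1  => 'could not create a socket to the firewall',
    -2  => 'could not connect to the firewall',
    -10 => 'could not create a socket to direct host',
    -20 => 'could not connect to a direct host',
    90  => 'successful SOCKS transfer',
    91  => 'SOCKS server could not connect to a host',
);

# Turn a return code into a message suitable for a log or a report.
sub describe_rconnect {
    my ($rc) = @_;
    return exists $RCONNECT_STATUS{$rc}
        ? $RCONNECT_STATUS{$rc}
        : "unrecognized return code $rc";
}

print describe_rconnect(-2), "\n";
```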

About the Author

Matt Ganis currently works for IBM in Harrison, New York. In his spare time, he can be found teaching astronomy at Pace University in Pleasantville, New York. He can be contacted via email at ganis@vnet.ibm.com.