Cover V06, I01
Figure 1
Figure 2
Figure 3
Listing 1
Listing 2
Listing 3


Eschewing The Internet for External Data Sharing

Jonathan Feldman

If you have a need to automate external data transfers because you're tired of shuffling tapes, but you shudder every time you're invited to a vendor's free-donut seminar that promises "secure" Internet transactions, perhaps it's time to get back to basics. If your bandwidth needs are modest and you're dealing with a security-conscious agency like a bank, you might want to take a break from the all-digital network and reacquaint yourself with your old reliable pals: the serial port on a UNIX host, a modestly capable modem, and a telephone line.

In some cases, the decision about how to automate processes may already be made for you. Some agencies, especially banks, are understandably squeamish about the glory of the Internet and are hesitant to transfer sensitive information over this high-profile and thus very targeted network.

In our case, we have many departments that need some sort of external data on a monthly basis. One department has been schlepping 9-track tapes over to us for years, running a script at one of our system consoles, then mailing the tapes back to the vendor. Argh!

Another department does have an automated online transfer via modem, but it involves ftp-ing the file from the host to a specific user's PC, then calling a ProComm script to do the transfer from the PC's modem. Additional (and nondesirable) points of failure include the network, the PC and its configuration, the local modem, and of course the user. (See Figure 1.)

As more of our departments displayed the need to transfer data to and from the outside, it became clear that we needed to do this from the Unix host itself, which is a critical machine; great care is taken to assure its uptime. Also, because the databases run on the UNIX host, doing the transfers on that host made the most sense (Figure 2).

Our goal was to give users a simple external command that would initiate a call to the outside, then send or retrieve data via a modem on the Unix host, utilizing any number of transfer protocols. We tried several complex ways before settling on a very satisfactory and simple solution.

Unnecessarily CommPlex

Because we already had transactions happening via ProComm on a DOS PC, we reasoned that a UNIX work-alike might be our best bet. (Wrong! I'll show why in a minute.) We investigated many of the tty-based, freely available communication packages for UNIX, including xcomm, kermit, and pcomm.

All the packages worked pretty well doing their dial-out thing. However, pcomm was the only one that supported all of the protocols we needed, and I was having trouble porting the IPC portion of it to our Dynix/ptx system. (This was either because I am STREAMS-disabled, or because the implementation was a bit different than usual.) While I was banging my head against this code, a co-worker poked his head into my office and gently asked if we really needed a full-featured terminal program for our application. After all, weren't we trying to avoid user intervention in the first place?

He was right, of course. After that, I modified a friend's modem initialization program to be an ad hoc dialer, which hooks stdin and stdout to the modem device and calls a shell script to deal with talking to the modem (Listing 1). This small and simple solution allows multiple command scripts, all of which can dial a number, listen for prompts, respond to them, and call any external file transfer program that may be necessary. We use the sz/rz package, but you can use any package you like. You can find some of the packages we tried in:

At first, we wrestled with the vendor-supplied cu/uucp package as a dialer, because it seemed like the "right" thing to do. But when it became a choice of fighting the cu-versus-the-external-file-transfer-program battle for a couple of hours versus 20 minutes of shell scripting on top of the simple dialer program, the correct choice became very clear.

Linux's SLIP/PPP chat program probably would have worked - if we had ported the source code, but by this time I was a little tired of trying to port large amounts of source code to Dynix/ptx. The simple dialer program is very portable and relies on one of the reliable abstractions of the UNIX world: everything's a file. You can get it and a sample script from:

How It Works

Your modem port, like most other devices, is referred to by its device node in the /dev directory, which is simply a filesystem-oriented way of referring to its device driver entry points in the kernel. For example, on my Linux system, when you type:

# ls -la /dev/ttyS1

you might see:

crw--w--w-   1 root  root   4,  65 Sep 22 15:59 /dev/ttyS1

Notice that there is no file size. Instead, there are two numbers, the "major" number and the "minor" number, in this case, 4 and 65. 4 is the major number, or the primary place in the kernel where this code resides (just think of it as a placeholder), and 65 is the minor number, in other words, precisely which device of this kind of devices. If you look at the other Linux serial port, you might see:

crw--w--w-   1 root  root   4,  64 \
Sep  9 13:23 /dev/ttyS0

Ah ha! The 65 has turned into a 64. This is how the driver differentiates between the two physical devices that it handles. So, when we write out to this "file," the kernel intercepts the write, and then hands it off to the major device driver. The device driver then decides specifically which physical device (minor) should get the write.

We now know that the modem is controlled via some black magic having to do with kernel entry points. This is fantastic academic knowledge, but what does it do for us as administrators and troubleshooters? It tells us that we either need to rely on the kindness of C programmers, or learn something about C programming ourselves if we want to muck about with device ports in any serious way, because all we can do to this device port is write to it and read from it.

So, for this application, why couldn't we just write a Bourne shell script like the following? (It puts the standard output between the parens TO the modem, and feeds the stuff between the parentheses FROM the modem.)

echo "ATDT18005551212"
sleep 2
sz myfile.txt
sleep 2
) >/dev/ttyS1 </dev/ttyS1
# reads and writes from the whole subshell (in parens)

This will work on some UNIX variants due to a special treatment of device names (see below). But, on others, it won't work because a modem port will not open to read or write unless one of the following two conditions is true:

1. Carrier detect is present

2. The port has been opened with the special O_NDELAY flag

Hmm, that #2 doesn't sound like something you can do from a shell script. It isn't. What to do? We can't have CD present when we're dialing out, because that would imply a busy line!

If you are fortunate enough to have one of the more modern UNIX variants that are friendlier to modem shell-scripters, you can get around condition #1 and #2, because your modem port has two names; a dial-out name and a dial-in name. The dial-in name is typically the normal name: ttyS1 in Linux, tty1a for SCO, or ttyGA/GAAM if you live in the Dynix world. The dial-out name varies between implementations: in Linux, for example, it is cua1, (take off the tty and substitute cua), and in SCO it is tty1A (capitalize the last letter).

The reason for these different dial-out ports is that they specify different major/minor pairs, which is the only way (without using C) that you can tell the port driver that you don't want it to wait for carrier, that you need to dial out, and thus have no carrier to begin with! For example, Linux's /dev/cua0 port looks like this:

crw-rw----  1 root   uucp    5,  64 Jul 17  1994 /dev/cua0

which has a different major number than the /dev/ttyS0, listed above.

Notice that Dynix does not have a dial-out device. If your UNIX doesn't either, you'll have to use the dialer code in Listing 1 to call your shell script, which should look very much like the example above, except that it is called from the dialer. See the example in Listing 2.

The dialer program is pretty simple. It opens the port with the O_NDELAY flag, which means "don't worry about whether you have carrier on that port or not, just open it!" The program also sets various ioctl (I/O control) parameters, including the bps rate. (You can set some of these parameters with stty, but as long as you are using a C wrapper, you might as well set them here.) Note how the program reads the terminal's control structures and then modifies them.

The dialer then closes its standard input, and reassigns it to the modem device that you've opened. Your shell script (which is a child process of the dialer program) can then start talking to the modem just by using echo and so forth. The same thing is done to the standard output so that your script can "read" standard input, which will be the output of the modem (Figure 3).

One warning: You can choose to set the dialer suid to uucp so that the dialer can open and dial the modem (in which case, you must make it executable only by a group trusted to use the modem). Or, you can change the modem port's group to a group trusted to use the modem. You will have to decide which is less of a security problem for you. The dialer program, after opening the port, switches back to the real users's uid, so the security risk should be minimal. Nonetheless, please be careful when assigning permissions to both the dialer and the shell script. These should not be writable by anyone except root. Don't be overly alarmed, though. This kind of security concern can and should arise with any type of user dial-out.

Note that if you want to share the line with a dial-in port, you'll have to implement uucp-style locking and use a "shareable" getty-like uugetty. There's a very good example of how to do this with the pcomm source code. In our case; we left the modem dial-out only, and disabled it in the /etc/inittab. You could also compile and use the freeware chat program, included with some Linux distributions, which is normally used for PPP connections. chat does more or less the same thing as the dialer program, but may also have more features or be more complex than you need.

The Home Stretch

After you finish configuring your system, you'll be able to supply users with one (see bottom of Listing 3) command line to run the automated data transfer. They can call the command, see if the external file is where it ought to be, do some sanity checking, and then integrate it into the database!

We have been running this setup for about six months now, with no problems whatsoever. Most of the setup time was in debugging the programs we tried. A month after setup, another database programmer asked to join our ever-growing horde of dialers, and we were up and running very quickly. The setup seems to replicate rather well, and since the dialer program is pretty small, the impact on system resources is also very low.

About the Author

Jonathan Feldman works with UNIX, Netware, and a whole football team of networking gear at the Chatham County Government in Savannah, GA. He likes to keep things simple so that even he can understand them. In his spare time, he writes, continues to figure out how to play guitar with his feet, and sings Barney songs to David Byrne tunes with his son, Leo.