Managing Dial-Up Services
The need for dial-up computing first appeared at Texas Instruments in Houston about 15 years ago with the onset of our first distributed computing environment, the Apollo computer. The system administrators did not want to continue driving back to work in the middle of the night when problems arose. Because computing was now distributed, the glass-house concept did not apply, and the budget no longer allowed for 24-hour-a-day on-site operators. There was no other choice. Today dial-up services are an integral part of doing business. With more than 200 laptop computers in this local facility alone, the demand for dial-up has increased explosively, and a way to manage the dial-up service has become a necessity.
The Origins of Serial Line Protocols
In the TCP/IP world, serial lines are used to create wide area networks (WANs). Unfortunately, a standard physical layer protocol for serial lines has not always existed. This led to the emergence of at least two protocols: Serial Line IP (SLIP) and Point-to-Point Protocol (PPP). As the cost of high-speed modems and computers dropped, the prospect of running IP over serial lines into the home became feasible.
SLIP, which was created first, originated in the 3COM UNET TCP/IP implementation from the early 1980s. SLIP is a very simple protocol and easily implemented. It also runs on just about everything. Around 1984, SLIP was implemented for 4.2 Berkeley Unix and Sun Workstations. It quickly became the standard for connecting TCP/IP routers and hosts with serial lines. With its simplicity came a lack of some basic functionality that may make it undesirable for dial-up use.
To address the shortcomings of SLIP, Point-to-Point Protocol (PPP) was developed as an Internet standard. PPP is more robust than SLIP, and it handles dynamic IP address allocations. While PPP is more difficult to implement and is not as readily available, its enhancements and robustness may make it the more desirable protocol for dial-up use.
Microsoft also jumped into the fray with its own proprietary protocol scheme, Remote Access Services (RAS). RAS is essentially a superset of PPP that incorporates NetBEUI and NetBIOS connectivity between NT servers and clients. RAS is extremely simple to set up and use because of Microsoft's integration efforts. RAS is targeted at x86 platforms running one of Microsoft's operating systems.
Selecting a Protocol
So, which protocol is best and which one should be used? Consider the following:
It depends on the makeup of the client base.
Pure Unix environments don't need RAS.
If all servers are Unix, then RAS isn't needed.
If some of the clients can't run PPP, then SLIP is needed.
If there are poor quality phone lines, then PPP may work better.
The most important factor is that multiple protocols will need to be supported.
Selecting a Terminal Server
The heart of a dial-up service is the terminal server. There are many different flavors of terminal servers. Most of these devices are enhanced x86 engines with specialized operating systems tuned to handle routing multiple serial lines onto a network.
Cisco Systems offers a range of products. On one end of the list is the 1020, which is a "personal router." The 1020 will route between one Ethernet network and either one or two phone lines. The 1020 can multiplex between the phone lines to effectively double the bandwidth. At the other end of the Cisco spectrum is the 2500 series of routers, which can handle multiple serial lines very effectively. Cisco makes other lines of routers as well, but these are the ones that make sense for dial-up.
Telebit Corporation offers a pair of terminal servers well suited to the task: the Netblazer ST and the Netblazer 40i. Both are x86 based, with the 40i being a larger version of the ST. Both handle multiple serial lines and multiple Ethernets. Telebit also offers PRI and V.35 interfaces for ISDN applications.
US Robotics offers a managed hub technology that allows a variety of cards to plug and play for a total solution. These include standard analog modems, digital modems, PRI/BRI and T1 cards, and even an x86 engine that runs Windows NT to provide terminal service in the chassis.
For serious forays into the realm of ISDN, both Cisco and Gandalf make site-class hubs for handling multiple PRI/BRIs and multiple LAN connections. Note that not all vendors' ISDN hardware is cross-compatible and that a single vendor solution is preferred. Also make sure that all of the telephone companies in the area support ISDN; otherwise, some users will be unable to use the services provided.
The choice of terminal server depends greatly on the demands of the customer base, the functionality required, and the available budget. Both Multitech and Boca make a modem rack that will hold sixteen 33.6 Kbps modems. Couple this with a Cisco 2500 or a Netblazer ST to get reliable dial-up services for under $10K.
What Information is Needed
In yesterday's environment, the number of remote users that could be supported was a moot point. The cost of the dial-up capability was so prohibitive on a large scale that it wasn't even considered by most businesses. In today's environment, not being able to remotely access the office may cost a business more than the dial-up service itself. So, how do you determine who is dialing in, how often, and how long they are on the line? Without this information, it is very difficult to fend off the hordes of people claiming to get a busy signal or to justify the expense of expanding your capacity. By identifying who is dialing in, or attempting to, you can trap potential security breaches as well.
syslog Provides Data
One of the best ways to get this information is with syslogd. All of the terminal servers surveyed offered some kind of syslog capability. With syslog, you can send activity information from one or more terminal servers to a central location for post-processing.
syslog passes a message to syslogd, which may append it to a log file, write it to the system console, or forward it to another host on the network, depending on the configuration of /etc/syslog.conf. syslog generates messages of varying levels. These levels can be used to sort and trap messages of particular types. These levels are described in detail in the local system help files and are not included in this article. By specifying a syslog host to your terminal server, you can capture its activity. For example, the entry for a Netblazer looks like:
syslog to host is on <Host IP>, facility = local0, level <= 7
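To make the facility and level in that entry concrete, the following is a minimal sketch of how the BSD syslog protocol (RFC 3164) encodes them and how a classic facility.level selector is matched. The function names are illustrative only and are not part of any terminal server or syslogd interface. A setting of level <= 7 means every severity through debug is forwarded.

```python
# Facility and severity numbers as defined by the BSD syslog protocol (RFC 3164).
FACILITIES = {"kern": 0, "user": 1, "mail": 2, "daemon": 3, "auth": 4,
              "local0": 16, "local1": 17}
SEVERITIES = {"emerg": 0, "alert": 1, "crit": 2, "err": 3,
              "warning": 4, "notice": 5, "info": 6, "debug": 7}


def priority(facility, severity):
    """Return the PRI number carried at the front of a syslog packet."""
    return FACILITIES[facility] * 8 + SEVERITIES[severity]


def selector_matches(selector, facility, severity):
    """Check a message against a classic 'facility.level' selector.

    A selector matches its own level and every more severe
    (lower-numbered) level of the same facility.
    """
    sel_facility, sel_level = selector.split(".")
    if sel_facility not in ("*", facility):
        return False
    return SEVERITIES[severity] <= SEVERITIES[sel_level]
```

With these definitions, a selector of local0.debug accepts everything the terminal server sends on local0, while local0.err would keep only errors and worse.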
The corresponding entry in /etc/syslog.conf on the host referenced by Host IP looks like:
These two items will cause the Netblazer to start sending data into the logfile on the specified host. By judiciously setting the level on the Netblazer and the level in the config file, you can have several log files contain different types of messages. The line above causes the information from the Netblazer to be dumped in with the system's own logging information. The scripts discussed later in this article will strip out the pertinent information. Once these changes are made and syslogd is restarted, or forced to reread its config file, the terminal server will start reporting something like the following:
Feb 25 14:20:18 squid.micro.ti.com syslog: inbound dynamic interface 'username-ppp' used line100 for 472 seconds 188.8.131.52
Feb 25 14:20:18 squid.micro.ti.com syslog: username on line100 at Feb 25 14:12 for 472 seconds
Feb 25 15:04:45 squid.micro.ti.com syslog: start login on line100
Feb 25 15:05:09 squid.micro.ti.com syslog: user1 logged-in from line100
Feb 25 15:05:10 squid.micro.ti.com syslog: user1 rlogin to shark from line100
Feb 25 15:05:58 squid.micro.ti.com syslog: interval: user1 logged-in from line100
Feb 25 15:12:22 squid.micro.ti.com syslog: user1 rlogin end on line100 to shark for 433
Feb 25 15:12:23 squid.micro.ti.com syslog: user1 on line100 at Feb 25 15:05 for 434 seconds
From this information, you can see the username that logged in, the line the connection came in on, the start time of the session, and the duration of the call.
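As an illustration of how those four fields can be pulled out, here is a minimal Python sketch that picks out the session summary records (the "user on lineNNN at ... for N seconds" entries) and ignores the rest. It is not the awk fragment from the listings. The regular expression and the function name parse_sessions are assumptions based only on the sample output above, and it expects each log record on a single line.

```python
import re

# Matches only the end-of-session summary records, for example:
#   Feb 25 15:12:23 host syslog: user1 on line100 at Feb 25 15:05 for 434 seconds
SESSION_RE = re.compile(
    r"^(?P<logged>\w{3}\s+\d+ \d\d:\d\d:\d\d) (?P<host>\S+) syslog: "
    r"(?P<user>\S+) on (?P<line>line\d+) at (?P<start>\w{3}\s+\d+ \d\d:\d\d) "
    r"for (?P<seconds>\d+) seconds$"
)


def parse_sessions(lines):
    """Return one dict per completed session found in the log lines."""
    sessions = []
    for raw in lines:
        match = SESSION_RE.match(raw.strip())
        if match is None:
            continue  # login, rlogin, and interface records are skipped
        sessions.append({
            "user": match.group("user"),
            "line": match.group("line"),
            "start": match.group("start"),
            "seconds": int(match.group("seconds")),
        })
    return sessions


sample = [
    "Feb 25 14:20:18 squid.micro.ti.com syslog: username on line100 at Feb 25 14:12 for 472 seconds",
    "Feb 25 15:04:45 squid.micro.ti.com syslog: start login on line100",
    "Feb 25 15:12:22 squid.micro.ti.com syslog: user1 rlogin end on line100 to shark for 433",
    "Feb 25 15:12:23 squid.micro.ti.com syslog: user1 on line100 at Feb 25 15:05 for 434 seconds",
]
records = parse_sessions(sample)
```

Other terminal servers word their records differently, so the pattern would need to be adjusted for each vendor.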
Put the Data on the Web
Squid, a collection of scripts that work together to create HTML for viewing, was developed to post-process the syslog information into useful data. This data can then be used to justify expenses, monitor access, and respond to reports of perceived outages.
Squid is driven by cron. This is a must in our environment, because we cannot afford the people resources to manually collect and process this type of data. The first script is the cron job that is run once a month to generate all the reports. The frequency is arbitrary and can be modified as desired.
# Run Squid logs once a month
01 00 28 * * /home/thomas/sys_admin/squid/cronentry
The cron job shown in Listing 1 performs the following actions:
Moves the log file to a logs directory and gives it a timestamp extension.
Creates a new (empty) log file for syslog to write the next month's data into.
Sends syslogd a HUP signal, which is necessary to restart logging. Moving the file out from under syslogd without signaling it can have undesirable results.
Uses awk, sort, and grep to create a pair of intermediate files. Both files contain the same information, sorted on different keys. The keys of interest are username and line number. extracted.1.$MONTH is sorted by line number, and extracted.2.$MONTH is sorted by username.
Compresses the original data file for long-term storage.
Calls a pair of Perl scripts that create HTML files from the sorted data files.
Although this process is somewhat inefficient, as opposed to doing the sorts and processing in a single Perl application, the intermediate files can be used to generate some other graphs for presentation of the data in a slightly different form. Because the process is run so infrequently, performance tuning isn't an issue.
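The first three steps of the cron job (archive the log, recreate it, signal syslogd) can be sketched in Python as follows. This is not the script from Listing 1. The function name rotate_log, the year-and-month timestamp format, and the idea of passing in the syslogd process ID are all assumptions made for illustration.

```python
import os
import shutil
import signal
import time


def rotate_log(log_path, archive_dir, syslogd_pid=None, stamp=None):
    """Archive the current log, recreate it empty, and optionally HUP syslogd.

    stamp defaults to the current year and month; syslogd_pid is supplied
    by the caller (hypothetical: how the PID is found varies by system).
    """
    stamp = stamp or time.strftime("%Y%m")
    os.makedirs(archive_dir, exist_ok=True)
    archived = os.path.join(archive_dir, os.path.basename(log_path) + "." + stamp)

    # Step 1: move the log into the archive directory with a timestamp extension.
    shutil.move(log_path, archived)

    # Step 2: create a new, empty log file for the next month's data.
    open(log_path, "a").close()

    # Step 3: tell syslogd to reopen its files so it writes to the new one.
    if syslogd_pid is not None:
        os.kill(syslogd_pid, signal.SIGHUP)

    return archived
```

Leaving syslogd_pid unset skips the signal, which is convenient for trying the rotation on a copy of the log before pointing it at the live file.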
Listing 2 shows the awk fragment that picks the pertinent data from the raw log file and formats the intermediate files.
Listings 3 and 4 are the Perl scripts that convert the raw data into HTML for viewing. Again, these two scripts are almost identical. They vary only in which field they process and how they label the table. These can be combined into one script for efficiency.
The first part of the script splits the input line and loads arrays with the desired data. This initial read also tallies the maximum values and line or user counters. The second part of the script processes the arrays and tallies the information on a per-line or per-user basis and outputs HTML.
Each time the cron job runs, it creates a pair of HTML files that contain tables for one month's worth of data. The script line.pl produces the tables shown in Figure 1. For each dial-up line, there is an entry in the table. If a line has multiple accesses, then the average and maximum usage for the line are displayed. The second table is a summary of all the activity for the dial-up service for the month. The script user.pl creates the same kind of tables, organized by user instead of by line.
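The per-line and per-user tallies can be sketched in Python as shown below. This is a simplified stand-in for line.pl and user.pl, not a translation of them. The record layout, the function names summarize and html_table, and the column headings are assumptions, and the sketch reports the average and maximum for every entry rather than only for lines with multiple accesses.

```python
from collections import defaultdict
from html import escape


def summarize(records, key):
    """Tally call count, total, average, and maximum seconds per line or user."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec["seconds"])
    summary = {}
    for name in sorted(groups):
        durations = groups[name]
        summary[name] = {
            "calls": len(durations),
            "total": sum(durations),
            "average": sum(durations) / len(durations),
            "maximum": max(durations),
        }
    return summary


def html_table(summary, label):
    """Render a summary as a plain HTML table, one row per line or user."""
    rows = ['<table border="1">',
            "<tr><th>%s</th><th>Calls</th><th>Total (s)</th>"
            "<th>Average (s)</th><th>Maximum (s)</th></tr>" % escape(label)]
    for name, s in summary.items():
        rows.append("<tr><td>%s</td><td>%d</td><td>%d</td><td>%.0f</td><td>%d</td></tr>"
                    % (escape(name), s["calls"], s["total"], s["average"], s["maximum"]))
    rows.append("</table>")
    return "\n".join(rows)


records = [
    {"line": "line100", "user": "username", "seconds": 472},
    {"line": "line100", "user": "user1", "seconds": 434},
    {"line": "line101", "user": "user1", "seconds": 60},
]
by_line = summarize(records, "line")
by_user = summarize(records, "user")
page = html_table(by_line, "Line")
```

Because the same function handles both keys, a single script can produce the line table and the user table from one pass over the data.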
Listing 5 is a hand-coded HTML page that provides a cover for the files created by the cron job. These two tables provide a way for users to browse 12 months of dial-up data as it becomes available.
Use the Data to Manage the Environment
You can use the tables to:
Obtain a general picture of dial-in usage
Identify heavily used lines
Identify heavy users
Monitor total capacity
From these particular tables, however, prime time cannot be determined. Listing 6 is another script that processes the extracted.1.$MONTH file into a very large file. This file can be imported into a spreadsheet program and used to create a radar plot that shows prime time and when users are most likely getting busy signals.
The radar plot in Figure 2 shows the number of lines concurrently in use at 10-minute intervals for the month. From the graph, it is clear that prime time is from about 7:00 p.m. until midnight. The time of day advances around the circle, and the number of lines concurrently in use is measured along the radius, with each concentric ring marking a line count. The broken ring at 14 indicates when users are likely to be getting busy signals. Because the time slice is 10 minutes, a single line can be accessed more than once within a slice, and each extra access is counted as an additional line. This causes the total number of lines in use to appear to exceed the physical maximum number of lines (14 in this case). The time interval can be made smaller; just make sure that your spreadsheet can handle the additional data.
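The counting behind such a plot can be sketched in Python as follows. This is not the script from Listing 6. The function name sessions_per_slot and the input layout are assumptions, the year is supplied by hand because the log records do not carry one, and the slot length is assumed to divide evenly into 60 minutes. Like the plot described above, it counts sessions rather than distinct lines, so two short calls on one line within the same slice are counted twice.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def sessions_per_slot(sessions, slot_minutes=10):
    """Count the sessions that overlap each time slot.

    sessions is an iterable of (start datetime, duration in seconds).
    Returns a dict mapping each slot's start time to a session count.
    """
    slot = timedelta(minutes=slot_minutes)
    counts = defaultdict(int)
    for start, seconds in sessions:
        end = start + timedelta(seconds=seconds)
        # Round the start time down to the beginning of its slot.
        current = start - timedelta(minutes=start.minute % slot_minutes,
                                    seconds=start.second,
                                    microseconds=start.microsecond)
        # Credit every slot the session touches before it ends.
        while current < end:
            counts[current] += 1
            current += slot
    return dict(counts)


sessions = [
    (datetime(1997, 2, 25, 14, 12), 472),
    (datetime(1997, 2, 25, 15, 5), 434),
    (datetime(1997, 2, 25, 15, 8), 60),
]
usage = sessions_per_slot(sessions)
```

Writing the resulting slot times and counts out as comma-separated values gives a file a spreadsheet can plot directly.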
Each type of terminal server will write slightly different forms of information to syslog. You can modify the initial awk scripts to generate the same format of intermediate files, so the secondary scripts will always create correct HTML. Nothing presented here is overly complex, and most system administrators should be able to tailor this to their own installation.
The intermediate files possess untapped wealth. Additional scripts can provide information targeted at other areas, such as security. More frequently run cron jobs can proactively capture unauthorized attempts or excessive usage and provide email or pager notification.
Use the Data to Manage Management
One of the hardest tasks a system administrator faces is providing data about the environment. While all system administrators are keenly aware of the shortcomings of their environments, convincing managerial and financial folks can be agonizing. Without data, it's just another opinion. Squid provides that data, which can be used to make the system administrator's job easier, make dial-in access more secure, and provide immediate visibility for the general user community that "the dial-in thing" is really working.
The key to successful monitoring is selecting a terminal server vendor that provides good syslog capabilities. From that point, the rest is easy.
Hunt, C. 1994. TCP/IP Network Administration. O'Reilly & Associates, Inc.
Network Working Group. RFC 1055, RFC 1171, RFC 1172.
About the Author
Jeffrey R. Thomas, P.E. is the manager of ASP workstation support for Texas Instruments, Inc. in Stafford, TX. He has been a system administrator for 13 years. His current installed base consists of 400+ Unix