Cover V11, I09

Article

sep2002.tar

IP Space Monitoring System

ChokSheak Lau

Systems and network administrators typically deal with hundreds and thousands of computers, and it can be hard to track all the IP addresses that are actively in use. This article presents a concept of implementing IP address space monitoring software for statically assigned IP addresses, and how we currently use it to provide real-time network data online.

There are already network monitoring programs available; many are, in fact, freely available on the Internet. Much of this existing software serves to periodically poll each IP address to check whether it is online. Alternatively, programs might use SNMP to retrieve information or receive trapped signals from subnet nodes. However, the concept that I am presenting can record the online history of each node in the subnet and detect changes in the hostnames for each IP address.

Background

Our network administrators wanted software that would provide reliable real-time data on the percentage uptime of each IP address in the entire 1500-node LAN. We assume a subnet is not using DHCP, but rather that the subnet consists of statically assigned IP addresses for each node. The key objective is to reclaim an existing IP address for reuse if that IP address is not adequately utilized. Therefore, we searched for a piece of software to do just that -- collect uptime information for the network and generate summary statistics. We couldn't find any software providing all these functions, so we created our own system of IP tracking.

This IP space monitoring system software was designed to perform the following tasks:

1. Poll each node in the entire LAN periodically. We chose a time interval of five minutes per poll. The polling (ping) is a lightweight process that can be done frequently without noticeably affecting the performance of the network.

2. Provide real-time information on a Web user interface on the percentage uptime of each node in the network. The software is able to track the percentage uptime for each 5-minute segment of the 24 hours per day.

3. Provide a historical account of when each IP has a change of hostname and the date and time that it changed.

4. Check for duplicate hostnames within the entire LAN. (Tracking hostnames can also be a problem if you have more than a thousand hostnames in the subnet.)

As mentioned previously, I found no piece of software that provides the above features. If anyone knows of software that does, I would be happy to check it out.

Polling

I chose to use Nmap to do the ping scanning (polling) for the nodes. The man page of Nmap says:

It should also be noted that Nmap has been known to crash certain poorly written applications, TCP/IP stacks, and even operating systems. Nmap should never be run against mission critical systems unless you are prepared to suffer downtime. We acknowledge here that Nmap may crash your systems or networks and we disclaim all liability for any damage or problems Nmap could cause.

Because of the slight risk of crashes and because a few black hats like to use Nmap for reconnaissance prior to attacking systems, there are administrators who become upset and may complain when their system is scanned. Thus, it is often advisable to request permission before doing even a light scan of a network.

Nmap should never be run with privileges (eg suid root) for security reasons.

Therefore, Nmap is not the safest IP scanner utility available. But the overriding reason we chose Nmap is that it can extract information on the OS for each IP. And, knowing the operating system is a highly desirable feature to include in our scanning tool. So far, Nmap has not caused us problems. If the OS information is not required, the task can be performed by using both ping(8) and gethostbyaddr(3) (under Linux).

Our scanning software performs a basic ping scan on each IP address to check whether the IP is alive. If that IP address is newly found to be alive, then it is inserted into the database as a new entry. Otherwise, the software will update that IP's information within the database (increment the counter for that 5-minute time segment of the day when that IP is alive). For instance, if the time segment starts at 0000 hours (midnight), then any time between 0000-0004 hours falls within the first time segment, and any time between 0005-0009 hours falls within the second time segment, and so on. After all IP addresses are scanned, the total counter for that time segment of each IP will be incremented. Thus, we can calculate the percentage uptime for each time segment for each IP using the formula (for illustrative purposes only):

percentage uptime for time segment N = (active counter for N)/(total counter for N)

For each counter, 288 unsigned integers are compacted and stored as a blob (binary large object data-type in MySQL), and each IP has two blobs -- one for the active counter and one for the total counter. For all these nodes, the size of the database is less than 10 MB. Note that each counter value is stored as a numeric string with a fixed length of 8 bytes. We found that storing the counters as packed binary, unsigned, long integers resulted in premature truncation of the blob data when using Perl DBI with MySQL. This peculiarity (bug?) was discovered during testing.

We chose the MySQL database because it is very fast and reliable and can be used easily with the Perl DBI package. Both packages were found to be reliable and caused no problems in downloading and installation. Both are also extensively documented. We chose Perl 5.6.0 because it provides maximum programmer efficiency and has a wide range of freely available libraries.

Optimizing for Speed

To serve database queries on real-time data, we wrote a Web interface that allows the user to select the data columns to display, the sorting criteria, the filtering criteria, and so on. Since it would be a waste of disk space to store data that could be derived using other data within the system, our initial database design did not store any derivable data. An example of derivable data is the string IP address ("123.123.123.123" = 15 bytes), which is stored in the network long format (4 bytes). However, we have to convert the saved data into a human-readable form every time we display it, which slows down the system.

Thus, I changed the format of the database to store all the data that could be derived from other data. The final database format looks like this, as a MySQL command:

create table iptable
(
   # derived field
   IP               char(15),              # string IP derived from LongIP

   # original fields
   Hostname         varchar(255),
   OS               varchar(255),
   TimeActiveCounts blob,                  # sequence of 288 unsigned long, 8 bytes each
   TimeTotalCounts  blob,                  # sequence of 288 unsigned long, 8 bytes each
   DateInserted     datetime,              # date the IP was inserted into the database
   LastActive       datetime,              # date the IP was found to be last active

   # derived fields
   Active           int unsigned,          # sum of TimeActiveCounts
   Total            int unsigned,          # sum of TimeTotalCounts
   PercentActive    int unsigned,          # integer( 100 * Active / Total )
   DaysOffline      int unsigned,          # NOW() - LastActive
   LastModified     datetime,              # _LastModified in human-readable format

   # non-display fields
   _LastModified    timestamp,             # automatically updated by MySQL 
                                           # whenever the entry is modified
   LongIP           int unsigned not null, # IP in network long format, 4 bytes
   primary key( LongIP )
)
The plan now is to refresh all the derivable data ("derived fields" above) every time a database query is generated. This provides only a slight improvement in speed because the only operation that is avoided is to re-convert the long IP address to the string IP address. I decided to try an unusual strategy -- to display the existing data first, and then update the database. This means that the first view of the database is always outdated, which is hardly noticeable because the data has already been accumulated over months and the derived data is updated once daily using a cronjob. Therefore, the data is always updated within 24 hours by refreshing the database daily. The result is that the time taken for any database query gets cut down from more than 30 seconds to less than 2 seconds. The overhead in summing up the data within the blob columns is the greatest slowdown, and that is now executed quietly in the background.

Note that when storing IP addresses in the database, we always want to store in both network long format (4 bytes) and string format (15 bytes). The network long format allows us to sort the IP addresses correctly, while the string format allows the IP addresses to be displayed in human-readable form without conversion.

Dynamic Web Interface

We chose PHP 4.0 because it is fast, free, and easy to use. PHP comes with built-in support for MySQL, which is a big help. Any information that is useful to the systems administrator is also useful to an intruder. Make sure you apply a level of security for this system that is appropriate for your network. To secure the data over the Web interface, be sure to use SSL. You can provide a MySQL login Web page, and on top of that, provide a UNIX login to restrict access to only certain users and domain names, which can be done by configuring Apache.

Summary

I hope that this article has shown a useful way of thinking about network monitoring. Putting this IP space-monitoring system together is a matter of using the right tools for the job, and no complex algorithms or socket programming skills are necessary. Rather, some reference books, some time, and energy are required. With this setup, we can now do a regular expression (regex) search for all the IP addresses that match a particular regex pattern, such as "123.123.123.*". And, we don't have to key in all the IP addresses manually to be able to do that. This system, called "ECE Scan", is currently in use by the School of ECE, Georgia Tech.

All code for this article is available at:

http://www.sysadminmag.com/code/
ChokSheak Lau is a Software Engineer at the Georgia Institute of Technology. He has a Bachelor's in Computer Science, with expertise in Web authoring with database backend. He can be contacted at: chok@ece.gatech.edu.