A Perl Package for Monitoring Traffic
John Shearer
Systems and network administrators constantly struggle to know
what is happening on their networks. This job is difficult at best,
and at worst, it can be downright exasperating. With a myriad
of manufacturers, standards, and protocols, and with everyone thinking they
have the tool that does it all, the situation is not improving.
With this challenge in mind, I decided to turn to my old friends
for help: Perl, SNMP (Simple Network Management Protocol), CGI,
and good old-fashioned HTML.
The rtr-graph package described in this article is a set of Perl
scripts for polling routers (or other SNMP-enabled devices) for
information about traffic in and out of specified interfaces. You
can set up "rtr-traff" as a cron job to poll the
interface at a specified interval, then use a CGI script for a Web
front end to the finished graphs. The Web interface automatically
sorts results from different devices into separate drop-down lists.
You can also set up multiple config files to poll different devices,
change final graph specs, and set up new parameters. This concept
was originally designed to check our Internet T1 interface for traffic
levels during the day. It has since evolved into a versatile program
that gathers statistics from any device to check problems, get baselines,
or just see what's going on.
Rtr-graph allows the administrator to quickly gain access to traffic
statistics in specific areas of the network. When trouble spots
(or suspected trouble spots) arise, rtr-graph can be quickly adjusted
to monitor a certain area. The Web interface allows the administrator
to instantly view multiple graphs spanning several days to see trends
and set benchmarks (Figure 1).
In the next several sections, I will cover the installation and
use of the various Perl modules required for rtr-graph to function.
Once all the required components are explained, I will cover the
installation and use of the rtr-graph scripts. My current implementation
of rtr-graph is based on Red Hat 7.0 with a basic RPM installation
of Perl version 5.6.0, and Apache 1.3.12.
SNMP -- Simple?
The Simple Network Management Protocol (SNMP) can be anything
but simple. It is a low-level protocol used to manage and monitor
devices on a network. SNMP can provide an incredible amount of information
about your devices, as long as you know how to obtain and decipher
it. Most routers, many switches, and even end devices like workstations,
servers, and printers can be managed using SNMP. The Perl implementation
of SNMP that I am using is Net-SNMP, which offers an object-oriented
interface to the Simple Network Management Protocol. Net-SNMP, by
David M. Town, can be downloaded from:
http://www.perl.com/CPAN-local/modules/by-module/Net/
The current version as of this writing is 3.60. To unpack this Perl
module, download it to a local directory and issue the following command:
tar -xzvf Net-SNMP-3.60.tar.gz
Next, change to the newly created directory and issue these commands:
perl Makefile.PL
make test
make install
Installing GDGraph
The graphing functionality of Perl requires several packages to work
properly. Since hundreds of possible permutations exist in Linux installations,
I will mention the packages needed and discuss some of the possible
solutions (see the Resources section for more information).
The idea is to have a functioning installation of the Perl module
GDGraph, but this module has several prerequisites. Before installing
GDGraph, you must have zlib, freetype, libpng,
jpeg-6b, and gd installed and functioning. Luckily,
Red Hat Linux comes with all of these packages in RPM format, and
they work just fine.
You will also need to download the Perl modules GD-1.32, GDTextUtil-0.80,
and GDGraph-1.33 from:
http://www.perl.com/CPAN-local/modules/by-module/GD
Once you download and un-tar these archives, you can install them
by going into each of the newly created directories and entering the
following commands:
perl Makefile.PL
make
make install
If desired, GDGraph also has the optional command "make samples",
which will create a subdirectory containing many excellent examples
of how to use the GDGraph Perl module.
The last component to install is the ImageMagick package. This
package takes the individual graphs created for each interface and
uses the "Montage" function to combine them into a single
image. Again, the Red Hat RPM will work just fine for this install.
The rtr-graph Scripts
Once all the prerequisite modules
are installed, it is time to install and look at the scripts in
the rtr-graph package. Download the tar file rtr-graph-1.0.tar.gz
and put it into a temporary directory. Next, issue the command:
tar -xzvf rtr-graph-1.0.tar.gz
This will create a new directory called rtr-graph and extract the
scripts rtr-traff, rtr-graph, traf-lister.cgi,
and the sample configuration file traffic.cfg. These files
must be manually moved to their final locations: rtr-traff
and rtr-graph to a bin directory such as /usr/local/bin;
traf-lister.cgi to a Web cgi-bin directory; traffic.cfg
to /etc. To understand how the scripts work together, let's
look at the configuration file (Listing 1).
traffic.cfg is just an example and will need to be modified
to suit your own needs. You will need a different configuration
file for each device you want to poll. Each variable will need to
be defined, but some of them can be left to their defaults. Here
is a quick explanation of each of the variables:
$name -- This can be set to any name for this particular
setup that will be understandable to you. This name will be used
for graph names, data file names, etc. For all features of rtr-graph
to work, this name should match the base name of the configuration
file (e.g., if the configuration file is named /etc/traffic.cfg,
set this value to "traffic").
@interfaces -- This is a list of the interfaces on
the device to be polled. You may need to use some other SNMP management
software to know what the interface indexes are, but they usually
follow the pattern of 1, 2, 3, etc. You also have the option of
adding a colon-delimited value for the graph y-axis for just this
interface.
$y_axis_def -- This is the default value in megabits
for the graph y-axis. This value is overridden by any values specified
in the @interfaces list. Again, this value is in megabits
(e.g., a value of 10 will give the y-axis a value of 10,000,000).
$tmp_dir -- The directory to be used for the data files
and some temporary files needed by the rtr-graph script.
This directory will need to be writable by the executor of the scripts
and the httpd daemon in order for all the functions of the
Web interface to work.
$www_dir -- The directory that will store the finished
graphs. Again, this will need to be writable by the executor of
the rtr-graph scripts and the httpd daemon.
$offset -- This is the number of seconds to offset
the date for which a graph is generated, if no date is given. For
instance, a value of 86,400 seconds means that, if no date is given,
the default is to generate a graph for 86,400 seconds ago, or "yesterday".
$comm -- The Read community string for the device being
polled, usually "public".
$address -- The IP address or resolvable name of the
device to be polled.
$tile -- This is the grid size of the final graph specified
as "WIDTHxHEIGHT". If you were polling one interface then
you would specify "1x1". If you are polling eight interfaces,
then you could specify "8x1", "4x2", etc. This
depends mostly on how you would like to see the final display.
$skip -- This value specifies how often the graphing
program should insert an x-axis value. If you poll the device every
minute, trying to show every minute as an x-axis value would be
quite unreadable. I find these values to work fairly well:
- 1-minute poll -- $skip = 50
- 5-minute poll -- $skip = 10
- 10-minute poll -- $skip = 5
$gr_width and $gr_height -- These values specify
the width and height of the individual graphs. Again, this is user
preference and is highly dependent on how many interfaces you are
polling and the resolution of the display where you will be viewing
the graphs.
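To show how these settings fit together, here is a minimal sketch of
a configuration written as Perl variables, along with a helper that
splits the optional colon-delimited y-axis value from an @interfaces
entry. The values, the address, the directory paths, and the
parse_interface helper are all hypothetical and are not taken from
the package; Listing 1 remains the reference for the real file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical settings modeled on the variables described above.
my $name       = 'traffic';
my @interfaces = ('1:10', '2', '3');    # interface 1 overrides the y-axis
my $y_axis_def = 2;                     # default y-axis, in megabits
my $tmp_dir    = '/var/tmp/rtr-graph';  # assumed path
my $www_dir    = '/www/html/routers';   # assumed path
my $offset     = 86400;                 # default to "yesterday"
my $comm       = 'public';
my $address    = '192.0.2.1';           # placeholder address
my $tile       = '3x1';
my $skip       = 10;                    # suits a 5-minute poll
my ($gr_width, $gr_height) = (300, 200);

# Split "index:megabits" into the interface index and a y-axis value
# in bits, falling back to the default when no override is given.
sub parse_interface {
    my ($entry, $default_mb) = @_;
    my ($index, $mb) = split /:/, $entry, 2;
    $mb = $default_mb unless defined $mb && length $mb;
    return ($index, $mb * 1_000_000);
}

for my $entry (@interfaces) {
    my ($index, $y_axis) = parse_interface($entry, $y_axis_def);
    print "interface $index: y-axis $y_axis\n";
}
```

In this sketch, an entry of "1:10" gives interface 1 its own 10-megabit
y-axis, while "2" and "3" fall back to $y_axis_def.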
rtr-traff Script
I will now explain the rtr-traff script (Listing 2). rtr-traff
is the mechanism that polls the device for the number of octets
sent and received from each interface. This script takes one required
argument, which is the name of the configuration file to be used.
This configuration file should be specified as just the file name,
because the script will assume it is in the /etc directory.
Here is an example cron entry that will poll the router every five
minutes:
*/5 * * * * /usr/local/bin/rtr-traff --cfg traffic.cfg
Using the Net::SNMP Perl module, rtr-traff creates an object
for the device to be polled as found in the configuration file. It
then polls the two MIB objects specified as $inmib and $outmib
with the actual interface index appended to the end of each object.
These values are stored in a data file (the location specified in
the configuration file) along with the time in seconds since the epoch.
The data file name is derived from the $name variable, the
current date, and the interface index. This allows for each interface's
data to be stored in its own file. Because the data file name contains
the date, the files are automatically rotated every day.
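The polling steps just described can be sketched roughly as follows.
This is not the code from Listing 2. The OIDs are the standard MIB-II
values for ifInOctets and ifOutOctets, but the helper names, the data
file naming pattern, and the "epoch in out" line format are my own
assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);

# Standard MIB-II OIDs for ifInOctets and ifOutOctets.
my $inmib  = '1.3.6.1.2.1.2.2.1.10';
my $outmib = '1.3.6.1.2.1.2.2.1.16';

# Append the interface index to a base OID.
sub oid_for {
    my ($base, $index) = @_;
    return "$base.$index";
}

# Build a per-interface, per-day data file name (assumed layout).
sub data_file {
    my ($tmp_dir, $name, $index, $when) = @_;
    my $date = strftime('%Y%m%d', localtime $when);
    return "$tmp_dir/$name-$date-$index.dat";
}

# Poll one interface and return (in_octets, out_octets).
# Net::SNMP is loaded here so the helpers above work without it.
sub poll_interface {
    my ($address, $comm, $index) = @_;
    require Net::SNMP;
    my ($session, $error) = Net::SNMP->session(
        -hostname  => $address,
        -community => $comm,
    );
    die "SNMP session failed: $error\n" unless defined $session;

    my @oids   = (oid_for($inmib, $index), oid_for($outmib, $index));
    my $result = $session->get_request(-varbindlist => \@oids);
    my $err    = $session->error;
    $session->close;
    die "SNMP get failed: $err\n" unless defined $result;
    return @{$result}{@oids};
}

# Append one "epoch in out" line to the interface's data file.
sub record_sample {
    my ($file, $when, $in, $out) = @_;
    open my $fh, '>>', $file or die "Cannot append to $file: $!\n";
    print {$fh} "$when $in $out\n";
    close $fh or die "Cannot close $file: $!\n";
}
```

A cron-driven run would call poll_interface once per configured
interface, then pass the two counters and the current time to
record_sample with the name returned by data_file.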
rtr-graph Script
Next comes the rtr-graph script (Listing 3), which is the
real workhorse of the system. This script can be broken down into
four basic components: 1) startup and setting of variables; 2) processing
the data collected into a more usable format; 3) making the small
graphs for the individual interfaces; 4) generating the final picture
and doing some cleanup.
Let's consider these components one at a time. Most straightforward
is the setup where rtr-graph sets variables and offers help
if requested. It also checks for command-line arguments, such as
the required configuration file. We also have an option of specifying
a date for the graph. This gives us the option of re-creating graphs
for previous days (as long as we still have the data files) and
creating graphs on the fly for the current day. There is a built-in
option for specifying "today" as a valid date, so you
don't have to figure out today's date yourself. This makes
it a little easier to use "rtr-graph" in other
scripts without having to write code for date calculation.
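The date handling described above might be sketched like this. The
graph_date helper and the YYYYMMDD date format are assumptions made
for the example; the actual script may format its dates differently.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);

# Resolve the date to graph: "today" means the current date, any other
# explicit date is used as given, and no argument falls back to the
# configured offset in seconds.
sub graph_date {
    my ($arg, $offset, $now) = @_;
    $now = time unless defined $now;
    if (defined $arg) {
        return strftime('%Y%m%d', localtime $now) if $arg eq 'today';
        return $arg;
    }
    return strftime('%Y%m%d', localtime($now - $offset));
}

print "default date: ", graph_date(undef, 86400), "\n";
print "today:        ", graph_date('today', 86400), "\n";
```

With an offset of 86,400, the first line prints the date one day
before the second.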
Next, rtr-graph parses through each data file and puts
the polled information into a graphable format. One interesting
SNMP problem is that the values for ifInOctets and ifOutOctets are
stored in octets (bytes) instead of bits, which is what most people
are used to. This is handled by the $scale variable, a divisor that
is set to .125 (or 1/8) so that octets are converted to bits. This could
also be set to 1 for bytes, 128 for kilobits, 1024 for kilobytes,
etc. Just remember to change your graph labels to reflect this!
Another SNMP issue to overcome is the fact that integers stored
in a MIB counter have a 4-byte maximum value, or 4,294,967,295.
Even on a 1.5-Mb T1, this counter can roll over several times a
day. To handle this, the script checks whether the current value
it is processing is less than the last value it processed (meaning
a rollover has occurred) and uses this maximum to calculate the
actual data throughput during that period.
rtr-graph is also able to accurately graph using variable
sample periods. It uses the time stamps to see how many seconds
passed between samples, and then divides the traffic by that value.
This solves two potential problems. First, the administrator can
set different sample rates for different devices, or change the
sample rate on a device, while being assured of getting an accurate
bits-per-second reading. Second, if the rtr-traff script
misses a reading (which can happen for any number of reasons), the
graphing script will just treat the gap as a long sample and still
give an accurate reading.
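The rollover check and the per-second calculation can be illustrated
with a short sketch. It assumes that $scale is applied as a divisor
(which is what the values .125, 1, 128, and 1024 imply) and that the
counter wraps at most once between two samples. The function names
and the [epoch, octets] sample layout are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $counter_modulus = 2**32;   # a 32-bit counter wraps after 4,294,967,295
my $scale           = .125;    # octets divided by .125 gives bits

# Octets transferred between two readings, allowing for one rollover.
sub counter_delta {
    my ($prev, $cur) = @_;
    return $cur - $prev if $cur >= $prev;
    return $counter_modulus - $prev + $cur;
}

# Average rate between two samples, each given as [epoch, octets].
# Dividing by the elapsed seconds keeps the result accurate even when
# the polling interval changes or a reading is missed.
sub rate {
    my ($older, $newer) = @_;
    my $seconds = $newer->[0] - $older->[0];
    return 0 if $seconds <= 0;
    return counter_delta($older->[1], $newer->[1]) / $scale / $seconds;
}

# 3,750 octets in 300 seconds is 30,000 bits, or 100 bits per second.
print rate([0, 0], [300, 3750]), "\n";
```

If the counter could wrap more than once between polls, this check
cannot detect it, which is one reason to keep the polling interval
short on busy links.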
Now I will use the functions of GDGraph to make the individual
graphs for the interfaces. The documentation that comes with GDGraph
is fairly clear about how to create a variety of different graphs,
and the samples provided are excellent. I simply feed the newly
created graph object the data just compiled, along with a few pre-set
variables for labels, graph size, etc. These individual graphs are
stored in png format in the temp directory specified in the
configuration file. I used to use the GIF format for my graphs,
but during a re-install some time ago, I found this excerpt from
the GD package README file:
===> This version of GD no longer supports GIF output because of
===> threats from the legal department at Unisys. Source code
===> that calls $image->gif will have to be changed to call either
===> $image->jpg or $image->png to output in JPEG or PNG formats.
===> The last version of GD that supported GIF output was version
===> 1.19.
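For readers who have not used GDGraph before, here is a minimal
sketch of feeding data to a line graph and writing it out as PNG. It
is not the code from Listing 3. The helper names, the sample layout,
and the option values are assumptions; the GD::Graph calls themselves
(new, set, set_legend, plot) are the module's documented interface.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Arrange samples the way GD::Graph expects: one array of x-axis
# labels followed by one array per data set.
sub graph_data {
    my (@samples) = @_;    # each sample: [label, in_bps, out_bps]
    my (@labels, @in, @out);
    for my $sample (@samples) {
        push @labels, $sample->[0];
        push @in,     $sample->[1];
        push @out,    $sample->[2];
    }
    return [ \@labels, \@in, \@out ];
}

# Render one interface's graph to a PNG file. GD::Graph is loaded
# here so graph_data can be used without it.
sub write_graph {
    my ($file, $data, %opt) = @_;
    require GD::Graph::lines;
    my $graph = GD::Graph::lines->new($opt{width}, $opt{height});
    $graph->set(
        title        => $opt{title},
        x_label      => 'Time',
        y_label      => 'Bits per second',
        y_max_value  => $opt{y_max},
        x_label_skip => $opt{skip},
    ) or die $graph->error;
    $graph->set_legend('Traffic In', 'Traffic Out');
    my $gd = $graph->plot($data) or die $graph->error;

    open my $fh, '>', $file or die "Cannot write $file: $!\n";
    binmode $fh;
    print {$fh} $gd->png;
    close $fh or die "Cannot close $file: $!\n";
}
```

The x_label_skip option is where a value such as $skip would be
applied, and y_max_value is where the y-axis setting in bits would go.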
In the final step, rtr-graph uses the Montage function of ImageMagick
to compile the individual graphs into a single graphic. Here the $tile
setting from the configuration file comes into play. If you are polling
one, two, or even three interfaces, a single row of graphs will be
okay. I pushed the potential of this system and polled all 32 interfaces
of a high-end router we were using. It worked like a champ, but that's
a lot of graphs to try to look at on the screen! In this case, I actually
opted for a $tile value of "32x1". Even though this
created a final image of 9600 pixels across, I found it easier to
get comparisons over several days by viewing one day per row. YMMV.
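As a rough illustration of the montage step, the sketch below builds
an argument list for ImageMagick's montage program and works out how
wide the result will be. The helper names and file names are
hypothetical, and the "-geometry +0+0" setting (no resizing and no
spacing between tiles) is my assumption, not necessarily what
rtr-graph passes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the argument list for ImageMagick's montage program.
sub montage_args {
    my ($tile, $output, @graphs) = @_;
    return ('montage', '-tile', $tile, '-geometry', '+0+0', @graphs, $output);
}

# Width in pixels of the final image for a given tile value and
# individual graph width, assuming no spacing between graphs.
sub montage_width {
    my ($tile, $gr_width) = @_;
    my ($columns) = $tile =~ /^(\d+)x\d+$/
        or die "Bad tile value: $tile\n";
    return $columns * $gr_width;
}

print join(' ', montage_args('2x1', 'traffic.jpg', 'if1.png', 'if2.png')), "\n";
print montage_width('32x1', 300), "\n";
```

Passing the list from montage_args to system() in list form avoids
shell quoting problems. The second line shows that a 32x1 tile of
300-pixel graphs comes to 9600 pixels across.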
As part of the final step, rtr-graph removes the temporary
graphs and sets the owner and group of the final image
to root. This matters because the Web interface can generate
the graphs on its own, which gives the image the owner and group
of the httpd daemon. Even if rtr-graph
re-generates the graphs during its regular execution, the ownership
will not automatically change to root (or whoever the cron job runs
as) if the image already exists. This step is not absolutely necessary,
but I feel better knowing that the Web daemon (or rogue Web processes)
cannot alter or remove the images once they are written.
Finally, this script needs to be run at a regular interval just
like the rtr-traff script. An example cron entry to execute
this script every night at 12:30 a.m. might be:
30 00 * * * /usr/local/bin/rtr-graph --cfg traffic.cfg
Web Interface
The final piece to this system is the Web front-end for viewing
the graphs. This script (traf-lister.cgi -- Listing 4)
needs to be put in a cgi-bin directory on your Web server.
I put this script in a separate directory from my public scripts
in order to maintain some security. I'm not sure how this information
could necessarily be used maliciously, but the directory was already
set up for other administrative scripts, so it seemed like the logical
place to put it.
Because traf-lister.cgi is not tied to
any specific rtr-graph configuration file, some variables
must be hard-coded directly into the script. Those variables are:
$www_dir -- This is the full path to the directory
where the graphs are stored.
$cgi_dir -- The directory where this script resides.
This is most likely an alias that is defined in your Web server
configuration files (e.g., httpd.conf in Apache).
$bin_dir -- The directory where the rtr-graph
script is stored. This is necessary if you want to be able to generate
current graphs from the Web interface.
$url -- This is the full URL to the directory where
the traffic graphs are stored. Again, this may be an alias defined
in your Web server configuration.
I tried to make the Web interface as intuitive as possible without
cluttering it up with a lot of unnecessary text and instructions
(Figure 1). Basically, each graph that is stored in the $www_dir
directory (designated by the .jpg extension) is placed
in a list based on its prefix, which is defined by the $name
variable in the configuration file. Radio buttons correspond to
each list, so you can generate a new graph for that device. If you
try to graph too many devices with too many interfaces, you may
encounter a time-out situation on your browser. This has never happened
to me, but the possibility exists.
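The way graphs are sorted into lists by prefix can be sketched as
follows. This is an illustration only: the helper names are
hypothetical, and it assumes the finished graphs are named
"<name>-<date>.jpg", which may not match the real naming scheme.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Group finished graph files by the prefix before the first hyphen.
# Files without a .jpg extension are ignored.
sub group_graphs {
    my (@files) = @_;
    my %lists;
    for my $file (sort @files) {
        next unless $file =~ /^([^-]+)-.+\.jpg$/;
        push @{ $lists{$1} }, $file;
    }
    return \%lists;
}

# Read the graph directory and return its contents grouped by prefix.
sub read_graph_dir {
    my ($www_dir) = @_;
    opendir my $dh, $www_dir or die "Cannot read $www_dir: $!\n";
    my @files = grep { !/^\./ } readdir $dh;
    closedir $dh;
    return group_graphs(@files);
}

my $lists = group_graphs(
    'traffic-20010312.jpg', 'printers-20010312.jpg', 'traffic-20010311.jpg',
);
for my $prefix (sort keys %$lists) {
    print "$prefix: @{ $lists->{$prefix} }\n";
}
```

Each key of the returned hash would become one drop-down list in the
Web page, with the file names as its entries.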
To view the finished graphs, simply select as many images from
as many lists as you want and hit the SUBMIT button. Most browsers
support the [CTRL]-Click and [SHIFT]-Click options for selecting
multiple items in a list. The traf-lister.cgi script will
also remember your last selections; so, to remove them from your
next SUBMIT, select the auto-generated "No Graph" entry.
All the scripts in this package were designed with configuration
simplicity in mind. This allows an administrator to make a logical
separation in the devices monitored for easier viewing. For instance,
a number of printers can be polled and the graphs stored in /www/html/printers,
and a number of routers can be polled and stored in /www/html/routers.
Then, two different viewer scripts can be configured, so you can
quickly jump to just the devices you are looking for.
Conclusion
While rtr-graph cannot replace all commercial network analysis
products, it does offer an easy, automated way to view particular
devices that are potential trouble spots. Because it is easy to configure,
an administrator can readily customize the output. Figure 1 shows
an Internet router with one Ethernet interface that is being graphed
with a 10-Mb scale. It also shows two T1 interfaces that are graphed
with a 1.6-Mb scale (a T1 actually has a maximum of ~1.54 Mb, but
1.6 Mb makes the graph easier to view). With the separation of "Traffic
In" and "Traffic Out", we can also see exactly how
the data is moving through this device.
A nice advantage rtr-graph has over some other products
is the ability to view the statistics of several interfaces at the
same time. Not long after we put our load-balanced T1s into
production, we saw that even though they looked like they were balanced,
the Ethernet interface on the router showed we were still only getting
single-T1 performance and that the traffic on the T1s was
actually mirrored instead of balanced. Without being able to see
the T1 and Ethernet interfaces at the same time, the problem
might have gone unnoticed for quite a while. (Figure 1 shows an example
of this discrepancy.)
There's one last bit of maintenance to perform. Your temp
directory can quickly fill up with data files if you're not
careful. I like to keep these files around for a few days just in
case, but I have never needed them beyond that. A useful program
for handling this problem is the tmpwatch utility, which
can remove old files from any directory you specify. Red Hat systems
automatically run this utility daily to remove files from the /tmp
directory that are more than 10 days old.
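If tmpwatch is not available, the same kind of cleanup can be
approximated in Perl. The sketch below is an assumption-laden
stand-in, not part of the package: old_files is a hypothetical helper
that lists the regular files in a directory whose modification time
is more than a given number of days in the past.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the regular files in $dir that were last modified more than
# $days days ago. Subdirectories are not searched.
sub old_files {
    my ($dir, $days, $now) = @_;
    $now = time unless defined $now;
    opendir my $dh, $dir or die "Cannot read $dir: $!\n";
    my @old;
    for my $entry (sort readdir $dh) {
        my $path = "$dir/$entry";
        next unless -f $path;
        my $mtime = (stat $path)[9];
        push @old, $path if $now - $mtime > $days * 86400;
    }
    closedir $dh;
    return @old;
}
```

Printing the list first and only then passing it to unlink is a
sensible precaution, since the directory also holds the current day's
data files.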
Where Do We Go from Here?
A logical step for future implementation would be to use an SQL
database to store all the polled information rather than cluttering
up the system with text data files. As long as you're doing
that, you might as well poll for other types of information like
errors, broadcasts, etc. Most of the work has already been done;
you would just need to add the new MIBs to the polling script and
send the collected data to your database instead of a text file.
Then, you could make your Web interface so you could select only
the information and time period you want and have it create a custom
graph on the fly!
Resources
Net-SNMP -- http://www.perl.com/CPAN-local/modules/by-module/Net/
zlib -- http://www.info-zip.org/pub/infozip/zlib/
freetype -- http://freetype.sourceforge.net/
libpng -- http://freesoftware.com/pub/png/src/
jpeg-6b -- ftp://ftp.uu.net/graphics/jpeg/
gd -- http://www.boutell.com/gd/
GD, GDTextUtil, GDGraph -- http://www.perl.com/CPAN-local/modules/by-module/GD/
ImageMagick -- http://www.imagemagick.org/
John Shearer helps manage about 2000 laptops and desktops at
a boarding high school in Massachusetts. Although he comes from
a primarily DOS/Windows background, he has learned of the wonderful
world of Linux and devours anything Linux-related that he can get
his hands on. When he's not putting the final touches on a
new script, he loves to chase around his two-year-old son. He can
be reached at: jshearer@nmhschool.org.