Administering a Distributed Intrusion Detection System
Johannes B. Ullrich and Wayne Larmon
Intrusion detection systems have become ubiquitous. Due to the
existence of cheap commercial intrusion detection systems (IDS) and
relatively easy-to-use freeware, an increasing number of small business
users, and even home users, are collecting valuable intrusion detection logs.
In the same way that malware (malicious software) has prospered
under the collaborative effort of contributors, intrusion detection
systems can also benefit from a boost in efficiency through collaboration.
This article describes the efforts of DShield.org (http://www.dshield.org)
to build a global, distributed intrusion detection system. DShield.org
assembles and analyzes detection log data from networks all over
the world. In all, DShield.org processes one to two million events
per day. DShield's users include many home users and some operators
of large networks. This user base provides the DShield database
with a roughly representative sample of network activity "in
the wild". DShield studies the incoming data for unusual activity,
producing a daily block list of sites that appear to be participating
in Internet attacks, and notifying ISPs that their systems may have
been compromised. See the sidebar for the principles of DShield.
The payoff of this system is already evident. Internet worms have
been detected earlier, and the development of countermeasures has
been accelerated by the early detection provided by DShield.org.
System Operation
DShield.org provides a number of "clients", which are
small programs used to submit the logs. Typically, the client will
parse the log at the user's machine and submit a log in a standardized
format to DShield.org via email. Incoming email messages are queued
and parsed, and a number of reports are generated showing currently
reported activity and trends over time.
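To illustrate what such a client does, the following Python sketch converts one iptables-style log line into a tab-delimited record. It is only an illustration under stated assumptions: the function name normalize and the column order in FIELDS are hypothetical and do not describe the actual DShield submission format, and the sample line uses documentation addresses.

```python
import re

# Hypothetical column order for illustration only;
# this is not DShield's actual submission format.
FIELDS = ("SRC", "SPT", "DST", "DPT", "PROTO")

def normalize(line):
    """Turn one iptables-style LOG line into a tab-delimited record, or None."""
    # Collect KEY=value tokens; keys with an empty value (e.g. "OUT=") are skipped.
    pairs = dict(re.findall(r"\b([A-Z]+)=(\S+)", line))
    if not all(name in pairs for name in FIELDS):
        return None  # the line lacks a required field, so it is not submitted
    return "\t".join(pairs[name] for name in FIELDS)

sample = ("Jul 13 04:12:01 gw kernel: IN=eth0 OUT= SRC=192.0.2.10 "
          "DST=198.51.100.7 LEN=48 PROTO=TCP SPT=3127 DPT=80 SYN")
record = normalize(sample)
```

Normalizing on the user's machine keeps the submitted message small and lets the server treat all firewall products alike.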
A number of additional services that show the benefits of such
a system are provided: a recommended "block list" of networks
that are currently scanning other networks; the option to report
sources of attacks to network operators; and the ability to find
out whether other systems are being attacked by the same sources
using the same methods.
The closest comparable service to DShield.org is ARIS, operated
by SecurityFocus (http://aris.securityfocus.com/). ARIS focuses
on corporate users and offers extended reports and alerts as part
of its "Predictor" service. Some firewall and intrusion
detection companies also offer correlation software that provides
similar functionality. This software, if implemented, will correlate
all events collected by participants. However, participation is
limited to organizations implementing the system, so it does not provide
a global view.
Reports
The "Copernican Principle", known to astronomers and
others, can be applied to many phenomena. This principle assumes
that one's position as an observer in the universe is not special
in any way, and therefore, one's observations can be considered
representative. However, this principle is not considered valid
for intrusion detection. Many threats target particular networks
or prefer some networks over others. Therefore, it is critical to
compare one's observations to others in order to draw the correct
conclusions.
A distributed intrusion detection system (dIDS) provides an easy
way of doing this. For example, DShield.org provides each user with
customized reports detailing how many other users observed the same
source or the same attack. In some instances, it will also link
to more detailed descriptions of the nature of the attack. For the
public, reports are generated showing the overall distribution of
attacks by origin and type.
Two products of DShield.org -- a proactive block list,
and reports sent to administrators of systems implicated by reports
("Fightback") -- will be explained in more detail
below. Most reports are consolidated at the "Internet Storm
Center" (http://isc.dshield.org), which provides cross-linked
reports to quickly analyze current events.
Geographic Attack Distribution
Early on, one problem was finding the geographic distribution
of attack sources. It is an often-voiced opinion that Asia is contributing
to attacks more than other geographic areas. This assumption is
mostly validated by data collected so far. However, it appears that
the number of attacks originating in Asia results as much from
vulnerable systems in these countries being used as relays as
from users in these areas instigating attacks. DShield.org
recently started collaborating with the Korean Computer Emergency
Response Coordination Center (KRCERTCC). A daily data feed summarizing
the attacks that originate in Korea is used by the KRCERTCC to track
an Internet Service Provider's (ISP) performance over time.
Attacks by Target Port
A first indication of the nature of an attack is the port targeted.
While it is not conclusive to use the target port to identify an
attack, shifts in port targets are indicative of shifts in attacks
used to target networks. In several cases, these shifts have shown
early in the outbreak of new attacks. Figure 1 shows the increase
in port 80 reports as a result of the Code Red outbreak in July
2001. As early as July 13th, DShield.org provided an indication
of a significant increase in port 80 attacks.
DShield developed a process to further follow up on this report.
If a significant increase is detected, the user who originally submitted
the report is contacted. Further information provided by these users
(e.g., full packet logs or statements regarding network configuration)
is analyzed. If there is reason to suspect a new attack, we attempt
to capture the responsible code and issue a warning. This approach
maintains the agility of the system, which is based on limited header
information, and enables us to back up an alert with additional
data, if necessary.
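The first step of this process, spotting a significant increase in reports for a port, can be sketched in Python as follows. This is a simplified illustration, not the method DShield uses: the function name surging_ports, the thresholds factor and min_reports, and all counts in the example are invented.

```python
from collections import Counter

def surging_ports(baseline_days, today, factor=3.0, min_reports=100):
    """Return ports whose count today exceeds `factor` times the baseline daily mean.

    baseline_days: list of Counter objects, one per earlier day (port -> reports)
    today: Counter for the current day
    factor, min_reports: illustrative thresholds, not values used by DShield
    """
    flagged = []
    for port, count in today.items():
        if count < min_reports:
            continue  # too few reports to be meaningful
        mean = sum(day[port] for day in baseline_days) / len(baseline_days)
        # max(mean, 1.0) avoids flagging purely because the baseline was zero
        if count > factor * max(mean, 1.0):
            flagged.append((port, count, mean))
    return sorted(flagged, key=lambda item: item[1], reverse=True)

# Invented example data: two baseline days and one current day
history = [Counter({80: 900, 21: 400}), Counter({80: 1100, 21: 380})]
current = Counter({80: 5200, 21: 410, 111: 40})
alerts = surging_ports(history, current)
```

A flagged port is only a prompt for the manual follow-up described above; the header data alone does not identify the attack.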
Attack Persistence
As an attacker scans large network blocks, a single target will
not be able to ascertain whether the attack against it was a single
"slip" (e.g., a user typing a wrong IP address), a targeted
attack, or part of a widespread hunt for vulnerable systems. A collaborative
system like DShield, however, can follow an attack source as it
scans multiple networks. Figure 2 shows a graph of the persistence
of attacks. The plot shows the time between the first and last attack
reported to DShield.
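The quantity plotted, the time between the first and last report for each source, is straightforward to compute once reports from many submitters are pooled. The Python sketch below assumes the reports are available as (source IP, timestamp) pairs; the function name persistence_hours and the sample data are hypothetical.

```python
from datetime import datetime

def persistence_hours(reports):
    """Map each source IP to the hours between its first and last reported attack.

    reports: iterable of (source_ip, datetime) pairs from any number of submitters
    """
    first, last = {}, {}
    for source, seen in reports:
        if source not in first or seen < first[source]:
            first[source] = seen
        if source not in last or seen > last[source]:
            last[source] = seen
    return {s: (last[s] - first[s]).total_seconds() / 3600.0 for s in first}

# Invented sample: one source seen twice, another seen once
reports = [
    ("192.0.2.10", datetime(2001, 7, 13, 4, 0)),
    ("192.0.2.10", datetime(2001, 7, 13, 9, 0)),
    ("198.51.100.7", datetime(2001, 7, 13, 6, 30)),
]
spans = persistence_hours(reports)
```

A source reported only once has a persistence of zero, which is the case a single target cannot tell apart from a wider scan.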
Interestingly, the distribution can be explained by a statistical
fit using the assumption that 99.5% of the systems are taken offline
after an average of five hours (the "half-life") and
the remaining 0.5% keep scanning for an average of five days.
The function used for this fit is a sum of two exponential decays:
A * (r1 * exp(-x * ln(2) / h1) + r2 * exp(-x * ln(2) / h2))
where A is the total number of infected machines at the beginning,
x is the elapsed time, r1 and r2 are the fractions belonging to the
fast and slow components (r1=0.95, r2=0.05), and h1 and h2 are the
half-lives, after which half of the infected machines in each component
are fixed (h1=five hours, h2=five days). (Later, we will describe our
"Fightback" program, which attempts to improve this ratio.)
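The fit function can be evaluated directly. The following Python sketch is an illustration rather than the authors' code: the function name still_scanning is made up, the half-lives are expressed in hours (five days = 120 hours), and the default fractions follow the parameter list given with the formula (r1=0.95, r2=0.05). Other values can be passed to explore a different split between the two components.

```python
import math

def still_scanning(x_hours, total, r1=0.95, r2=0.05, h1=5.0, h2=120.0):
    """Evaluate A * (r1*exp(-x*ln(2)/h1) + r2*exp(-x*ln(2)/h2)).

    x_hours: elapsed time in hours
    total:   A, the number of infected machines at the beginning
    h1, h2:  half-lives in hours (defaults: five hours and five days)
    """
    fast = r1 * math.exp(-x_hours * math.log(2) / h1)
    slow = r2 * math.exp(-x_hours * math.log(2) / h2)
    return total * (fast + slow)

at_start = still_scanning(0, 1000)      # every machine is still scanning
after_5h = still_scanning(5, 1000)      # the fast component has halved
```

With these defaults, the fast component dominates the first day, while the slow component accounts for nearly all sources still active after several days.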
Proactive Block List
A common use of intrusion detection systems is to assemble a list
of "blocked" or "banned" IP addresses. For example,
if an IDS monitoring a public Web server, which cannot block port
80 for incoming traffic globally, detects a large number of HTTP
intrusion attempts from a given network, it may decide to block
future access to its system from this network. However, such a block
can only be implemented after the scan is detected, which is usually
too late.
Using a dIDS allows users to learn from attacks detected by others
and build a proactive consensus-based block list. This list will
include networks that have a recent history of being abused as attack
sources. A regularly updated list allows network administrators
to maximize the accessibility of their networks. Instead of blocking
large IP blocks, they could focus on smaller networks based on evidence
collected by others. Widespread implementation of such a block list
may also force listed networks to become more proactive in eliminating
malicious activity from the networks.
Currently, DShield generates a daily block list. It lists the
top 20 attack sources for the previous three days. Instead of focusing
on individual IP addresses, the list summarizes class C networks.
The list also includes a number of reserved addresses, which are
frequently used to spoof sources in a Distributed Denial of Service
(DDoS) attack. The list is available via HTTP and HTTPS at http://feeds.dshield.org/block.txt
or https://secure.dshield.org/feeds/block.txt. The format
is a simple tab-delimited format, which eases parsing by automated
scripts. A PGP signature is provided at http://feeds.dshield.org/block.txt.asc.
(See http://www.dshield.org/block_list_info.html for current
information on using the block list.)
DShield.org manually reviews this block list and notifies the
networks that are on the list so that they can stop this behavior.
The primary criterion for inclusion is the number of targets that
have reported attacks from a listed network over the previous three
days. The total number of different targets (rather than the total
number of accesses) is a better indicator of danger because the
attacks are attempting to infect or exploit a large number of machines.
Even after a network has been added to the "blocked IP"
list, attacks will continue. When implemented, this list can prevent
users from being affected by the attacks.
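The ranking idea described above (group sources into class C networks and count distinct targets rather than total accesses) can be sketched in Python. This is an illustration under assumptions, not DShield's implementation: the function name top_networks is hypothetical, the input is assumed to be (source IP, target IP) pairs already restricted to the three-day window, and the manual review step is not modeled.

```python
import ipaddress
from collections import defaultdict

def top_networks(reports, limit=20):
    """Rank /24 source networks by the number of distinct targets reporting them.

    reports: iterable of (source_ip, target_ip) pairs covering the review window
    """
    targets = defaultdict(set)
    for source, target in reports:
        # strict=False masks the host bits, giving the enclosing /24 network
        net = ipaddress.ip_network(f"{source}/24", strict=False)
        targets[net].add(target)
    ranked = sorted(targets.items(),
                    key=lambda kv: (-len(kv[1]), int(kv[0].network_address)))
    return [(str(net), len(seen)) for net, seen in ranked[:limit]]

# Invented sample: repeated accesses to the same target count only once
reports = [
    ("192.0.2.10", "203.0.113.1"), ("192.0.2.99", "203.0.113.2"),
    ("192.0.2.10", "203.0.113.1"), ("198.51.100.7", "203.0.113.1"),
]
ranking = top_networks(reports)
```

Using a set per network means a single noisy target cannot push a network up the list by reporting the same source many times.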
As an example, DShield.org includes a script to generate iptables
rules using this block list. Writing a script to automatically generate
iptables rules from a list retrieved online poses a number
of challenges. First, the script requires root privileges in order
to run. Second, you must carefully validate the retrieved content
to guard against an altered list that contains false
entries intended to block access for legitimate users.
While using Perl's taint mode is a minimum requirement,
the script also requires the use of digital signatures to validate
the content. The sample script below uses the PGP signature provided
by DShield. It assumes that the necessary keys are already present
in the executing user's keyring. An alternative and simpler
method is to utilize HTTPS, but many users do not have an HTTPS-capable
version of the Perl LWP module installed, and it is easier to install
GNU Privacy Guard (GnuPG).
The script generates a separate chain called "BLOCKLIST".
Using a new chain instead of adding the rule to an existing chain
will ease maintenance and lessen the probability of its interfering
with existing rules. The BLOCKLIST chain should be called
from the INPUT or FORWARD chain. A possible setup would look like this:
# allow trusted sources, which we never
# want to lock out
iptables -A INPUT -s (...trusted ip...) (..further restrictions, e.g. port..) -j ACCEPT
# call BLOCKLIST
iptables -A INPUT -j BLOCKLIST
# execute remainder of firewall rules
iptables -A INPUT ....
The same sequence can be used for other chains, such as FORWARD.
The Perl script in Listing 1 will retrieve the block list and add
the rules to the BLOCKLIST. The relevant PGP public keys can be found
at http://www.dshield.org/dshield_public_key.txt. You may want
to define a small chain to log blocked accesses distinctively. For
example, use a chain like:
$IPTABLES -N LOGBLOCK
$IPTABLES -A LOGBLOCK -j LOG --log-level warning --log-prefix "filter: BLOCKLIST "
$IPTABLES -A LOGBLOCK -j DROP
To use this new custom chain, change the following in Listing 1:
my $blocktarget='DROP'
to read:
my $blocktarget='LOGBLOCK'
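The validation concerns discussed in this section apply in any language. The Python sketch below is not a translation of Listing 1; it only illustrates two of the ideas under stated assumptions. The function names signature_ok and rules_from_networks are hypothetical, the networks are assumed to have been extracted from the list already (the column layout is not described here), the refusal of networks larger than /8 is an arbitrary safety limit, and the BLOCKLIST chain is assumed to exist. The rules are returned as argument lists and are not executed.

```python
import ipaddress
import subprocess

def signature_ok(list_path, sig_path):
    """Return True if gpg accepts the detached signature.

    Requires gpg to be installed and the signing key to be in the keyring.
    """
    result = subprocess.run(["gpg", "--verify", sig_path, list_path],
                            capture_output=True)
    return result.returncode == 0

def rules_from_networks(networks, target="DROP", chain="BLOCKLIST"):
    """Build iptables argument lists, rejecting anything but a plain IPv4 network."""
    rules = [["iptables", "-F", chain]]  # empty the existing chain first
    for text in networks:
        # IPv4Network raises ValueError for malformed input or stray characters
        net = ipaddress.IPv4Network(text.strip())
        if net.prefixlen < 8:  # assumed limit: never block huge address ranges
            raise ValueError(f"refusing suspiciously large network {net}")
        rules.append(["iptables", "-A", chain, "-s", str(net), "-j", target])
    return rules

rules = rules_from_networks(["192.0.2.0/24", "198.51.100.0/24"],
                            target="LOGBLOCK")
```

Passing each rule to the operating system as an argument list, rather than as a shell string, means that text from the downloaded file is never interpreted by a shell.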
Eliminating Attacks
It is important to notify administrators that machines under their
control have been accessing other machines in a hostile manner.
Administrators can then investigate the suspected machine to determine
whether the accesses were caused by a user performing cracking activity
or, more likely, by a compromised machine that is attempting to
compromise other machines. However, the vast increase in the number
of firewall users (compared to the days when only professional
administrators maintained firewalls) means that administrators are
now deluged with abuse reports.
One problem that can occur when individual users send abuse reports
is that activity that could be considered hostile might actually
have been caused by an innocent mistake, such as mistyping a URL.
Therefore, if individual users send abuse reports, there is the
danger that administrators will be flooded with abuse reports that
are based on innocent mistakes.
A second potential problem caused by individual users submitting
abuse reports is that most residential users of personal firewalls
are not trained in security. Consequently, they may not know how
to differentiate true hostile activity from "normal" network
activity, such as DHCP (Dynamic Host Configuration Protocol) traffic.
A third problem is the lack of standardization for abuse reporting
when individuals submit abuse reports. Each one is different, meaning
that administrators receiving these reports must spend additional
time studying them to determine whether the data is significant.
A standard format abuse report allows administrators to quickly
scan for relevant information. An even better solution would be
to eliminate sending abuse reports by email altogether and replace
them with more efficient summary reports tailored to an administrator's
needs.
DShield attempts to alleviate these problems by encouraging its
users to let DShield send the abuse reports. DShield-generated "Fightback"
abuse reports are sent only after a summary report from the DShield database
shows that accesses from a given source IP fit certain criteria.
The accesses must:
1. Come from a port that we consider indicative of suspicious activity
2. Have been logged by a minimum number of separate target machines
3. Not have already triggered a report to the administrator for this IP in
the past month
4. Include reports from at least one submitting user who agreed to have
its reports forwarded
If, and only if, these criteria are met will DShield send a "Fightback"
abuse report to the administrator of the network that controls the
implicated source IP. The abuse report summarizes the suspected
hostile access activity, giving log samples that show details of
the suspected hostile accesses. A coded link is provided to a custom
report describing the incident and showing all accesses linked to
this source IP. This report includes accesses submitted to our database
after the abuse message was sent, so that a concerned administrator
can periodically check the database to see whether this machine
has truly ceased the hostile activity.
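The four criteria amount to a simple conjunction, which the following Python sketch makes explicit. It is an illustration only: the function name should_send_report, the set of suspicious ports, the minimum of five targets, and the reading of "the past month" as 30 days are all placeholders, since the actual values are not given here.

```python
from datetime import datetime, timedelta

def should_send_report(port, targets, last_sent, consenting_users, now,
                       suspicious_ports=frozenset({21, 80, 111}),
                       min_targets=5):
    """Apply the four criteria; port set and thresholds are placeholders.

    port:             port associated with the logged accesses
    targets:          target machines that logged this source IP
    last_sent:        datetime of the last report for this IP, or None
    consenting_users: number of submitters who agreed to forwarding
    """
    recently_sent = (last_sent is not None
                     and now - last_sent < timedelta(days=30))
    return (port in suspicious_ports
            and len(set(targets)) >= min_targets
            and not recently_sent
            and consenting_users >= 1)

now = datetime(2001, 8, 1)
five_targets = ["203.0.113.%d" % i for i in range(1, 6)]
send = should_send_report(80, five_targets, None, 1, now)
repeat = should_send_report(80, five_targets, now - timedelta(days=10), 1, now)
```

Because every condition must hold, a single mistyped address or a report from one untrained user cannot trigger a message to an administrator.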
For large networks or ISPs, DShield provides custom "bulk"
abuse reports as an alternative to individual email abuse reports.
These are worked out on a case-by-case basis with the network administrators.
Future Plans
With the potential of more users applying firewall rules and disabling
unneeded services, intrusion attempts are more likely to focus on
the few remaining critical business services still commonly exposed
to the outside. As is already happening, more information will be
required to distinguish different types of attacks. In the immediate
future, collection of full packet content is planned from some users.
This will shorten our response time, as we will have full packets
for further real-time analysis.
Summary
Host- and network-based intrusion detection should be part of
every administrator's defense of a network against targeted
attacks. While individual IDSs are frequently criticized as being
reactive and more useful for forensics than for defense, joining
them with a large-scale dIDS (such as DShield.org) will make them
part of a proactive weapon in the administrator's arsenal.
Acknowledgements
DShield.org is currently supported by the SANS Institute. We would
like to thank the numerous contributors and current as well as past
cooperators. In particular, we'd like to thank Alan Paller,
Stephen Northcutt, John Green, and Matt Fearnow for continued support
of our activities.
Johannes Ullrich started DShield.org in November 2000. He joined the
SANS Institute as CTO for the SANS Institute's Internet Storm
Center in July 2001. Before that, he was employed as Lead Support
Engineer by Banta Integrated Media.
Wayne Larmon is a computer consultant with more than 20 years
of programming experience. He joined DShield.org shortly after its
inception as a volunteer and is now a consultant responsible for
client development.