
Administering a Distributed Intrusion Detection System

Johannes B. Ullrich and Wayne Larmon

Intrusion detection systems have become ubiquitous. Due to the existence of cheap commercial intrusion detection systems (IDSs) and relatively easy-to-use freeware, an increasing number of small business users, and even home users, are collecting valuable intrusion detection logs. In the same way that malware (malicious software) has prospered under the collaborative effort of contributors, intrusion detection systems can also become more efficient through collaboration.

This article describes the efforts of DShield.org (http://www.dshield.org) to build a global, distributed intrusion detection system. DShield.org assembles and analyzes detection log data from networks all over the world. In all, DShield.org processes one to two million events per day. DShield's users include many home users and some operators of large networks. This user base provides the DShield database with a roughly representative sample of network activity "in the wild". DShield studies the incoming data for unusual activity, producing a daily block list of sites that appear to be participating in Internet attacks, and notifying ISPs that their systems may have been compromised. See the sidebar for the principles of DShield.

The payoff of this system is already evident. Internet worms have been detected earlier, and the development of countermeasures has been accelerated by the early detection provided by DShield.org.

System Operation

DShield.org provides a number of "clients", which are small programs used to submit the logs. Typically, the client will parse the log at the user's machine and submit a log in a standardized format to DShield.org via email. Incoming email messages are queued and parsed, and a number of reports are generated showing current reported activity and trends over time.

A number of additional services that show the benefits of such a system are provided: a recommended "block list" of networks that are currently scanning other networks; the option to report sources of attacks to network operators; and the ability to find out whether other systems are being attacked by the same sources using the same methods.

The closest comparable service to DShield.org is ARIS, operated by SecurityFocus (http://aris.securityfocus.com/). ARIS focuses on corporate users and offers extended reports and alerts as part of its "Predictor" service. Some firewall and intrusion detection companies also offer correlation software that provides similar functionality. This software, if implemented, will correlate all events collected by participants. However, participation is limited to organizations that implement the system, so it does not provide a global view.

Reports

The "Copernican Principle", known to astronomers and others, can be applied to many phenomena. This principle assumes that one's position as an observer in the universe is not special in any way, and therefore, one's observations can be considered representative. However, this principle is not considered valid for intrusion detection. Many threats target particular networks or prefer some networks over others. Therefore, it is critical to compare one's observations with those of others in order to draw the correct conclusions.

A distributed intrusion detection system (dIDS) provides an easy way of doing this. For example, DShield.org provides each user with customized reports detailing how many other users observed the same source or the same attack. In some instances, it will also link to more detailed descriptions of the nature of the attack. For the public, reports are generated showing the overall distribution of attacks by origin and type.

Two products of DShield.org -- a proactive block list, and reports sent to administrators of systems implicated by reports ("Fightback") -- will be explained in more detail below. Most reports are consolidated at the "Internet Storm Center" (http://isc.dshield.org), which provides cross-linked reports to quickly analyze current events.

Geographic Attack Distribution

Early on, one problem was finding the geographic distribution of attack sources. It is an often-voiced opinion that Asia is contributing to attacks more than other geographic areas. This assumption is mostly validated by data collected so far. However, it appears that the number of attacks originating in Asia is as much a factor of vulnerable systems in these countries being used as relays as it is a factor of users in these areas instigating attacks. DShield.org recently started collaborating with the Korean Computer Emergency Response Coordination Center (KRCERTCC). A daily data feed summarizing the attacks that originate in Korea is used by the KRCERTCC to track an Internet Service Provider's (ISP) performance over time.

Attacks by Target Port

A first indication of the nature of an attack is the port targeted. While it is not conclusive to use the target port to identify an attack, shifts in port targets are indicative of shifts in attacks used to target networks. In several cases, these shifts have shown early in the outbreak of new attacks. Figure 1 shows the increase in port 80 reports as a result of the Code Red outbreak in July 2001. As early as July 13th, DShield.org provided an indication of a significant increase in port 80 attacks.
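A minimal Python sketch of how such a shift in port targets could be flagged, assuming daily report counts for a single port are already available. The function name, the seven-day baseline, and the factor of three are illustrative assumptions, not the method DShield.org actually uses.

```python
from statistics import mean

def flag_port_increase(daily_counts, baseline_days=7, factor=3.0):
    """Return True if the latest daily count exceeds `factor` times
    the mean of the preceding `baseline_days` counts.

    daily_counts: report counts for one port, oldest first.
    baseline_days and factor are hypothetical tuning values.
    """
    if len(daily_counts) < baseline_days + 1:
        return False  # not enough history to form a baseline
    baseline = mean(daily_counts[-(baseline_days + 1):-1])
    latest = daily_counts[-1]
    # With an all-zero baseline, any activity at all counts as new.
    if baseline == 0:
        return latest > 0
    return latest > factor * baseline
```

A flag from a check like this is only a prompt for follow-up, since the target port alone does not identify an attack.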

DShield developed a process to further follow up on such reports. If a significant increase is detected, the users who originally submitted the reports are contacted. Further information provided by these users (e.g., full packet logs or statements regarding network configuration) is analyzed. If there is reason to suspect a new attack, we attempt to capture the responsible code and issue a warning. This approach maintains the agility of the system, which is based on limited header information, and enables us to back up an alert with additional data, if necessary.

Attack Persistence

As an attacker scans large network blocks, a single target will not be able to ascertain whether the attack against it was a single "slip" (e.g., a user typing a wrong IP address), a targeted attack, or part of a widespread hunt for vulnerable systems. A collaborative system like DShield, however, can follow an attack source as it scans multiple networks. Figure 2 shows a graph of the persistence of attacks. The plot shows the time between the first and last attack reported to DShield.
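The persistence measure described above can be sketched in Python as follows. The record layout (timestamp, source IP, target IP) is an assumption made for this example, and the addresses are taken from documentation ranges.

```python
from datetime import datetime

def persistence_by_source(reports):
    """Map each source IP to (time between first and last report,
    number of distinct targets).

    reports: iterable of (timestamp, source_ip, target_ip) tuples;
    this record layout is an assumption made for the example.
    """
    first, last, targets = {}, {}, {}
    for ts, src, dst in reports:
        if src not in first or ts < first[src]:
            first[src] = ts
        if src not in last or ts > last[src]:
            last[src] = ts
        targets.setdefault(src, set()).add(dst)
    return {src: (last[src] - first[src], len(targets[src])) for src in first}

# Hypothetical reports: one source seen by two targets, one seen once.
example = [
    (datetime(2002, 1, 1, 0, 0), "192.0.2.1", "198.51.100.1"),
    (datetime(2002, 1, 1, 5, 0), "192.0.2.1", "198.51.100.2"),
    (datetime(2002, 1, 1, 1, 0), "192.0.2.9", "198.51.100.1"),
]
result = persistence_by_source(example)
```

A source with a single report and a single target looks like a possible "slip", while a source reported by many targets over hours or days looks like a scan. No single target could draw that distinction from its own logs.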

Interestingly, the distribution can be explained by a statistical fit using the assumption that 99.5% of the systems are taken offline after an average of five hours (the "half-life") and the remaining 0.5% will remain scanning for an average of five days. The function used for this fit is a sum of two exponential decays:

A * (r1 * exp(-x * ln(2) / h1) + r2 * exp(-x * ln(2) / h2))

where A is the total number of infected machines at the beginning, x is the elapsed time, r1 and r2 are the fractions of machines in the fast and slow components (r1=0.95, r2=0.05), and h1 and h2 are the half-lives, after which half of the machines in each component are fixed (h1=five hours, h2=five days). (Later, we will describe our "Fightback" program, which attempts to improve this ratio.)
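As a worked illustration, the fit function can be evaluated directly. This Python sketch leaves the fractions and half-lives as parameters. The function name and the starting count of 1000 are assumptions made for the example, and all times are expressed in hours.

```python
import math

def active_sources(x, total, r1, r2, h1, h2):
    """Number of sources still active after time x under the
    two-component half-life model.

    x, h1 and h2 must share one time unit (hours here).
    """
    fast = r1 * math.exp(-x * math.log(2) / h1)
    slow = r2 * math.exp(-x * math.log(2) / h2)
    return total * (fast + slow)

# Half-lives as quoted for the fit: h1 = 5 hours, h2 = 5 days.
H1_HOURS = 5.0
H2_HOURS = 5.0 * 24

# Sources remaining after five hours, starting from a hypothetical 1000.
remaining = active_sources(5.0, 1000, 0.95, 0.05, H1_HOURS, H2_HOURS)
```

After one fast half-life, the fast component has halved while the slow component has barely changed. This is why a small persistent fraction comes to dominate the tail of the distribution.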

Proactive Block List

A common use of intrusion detection systems is to assemble a list of "blocked" or "banned" IP addresses. For example, if an IDS monitoring a public Web server, which cannot block port 80 for incoming traffic globally, detects a large number of HTTP intrusion attempts from a given network, it may decide to block future access to its system from this network. However, such a block can only be implemented after the scan is detected, which is usually too late.

Using a dIDS allows users to learn from attacks detected by others and build a proactive consensus-based block list. This list will include networks that have a recent history of being abused as attack sources. A regularly updated list allows network administrators to maximize the accessibility of their networks. Instead of blocking large IP blocks, they could focus on smaller networks based on evidence collected by others. Widespread implementation of such a block list may also force listed networks to become more proactive in eliminating malicious activity from the networks.

Currently, DShield generates a daily block list. It lists the top 20 attack sources for the previous three days. Instead of focusing on individual IP addresses, the list summarizes class C networks. The list also includes a number of reserved addresses, which are frequently used to spoof sources in a Distributed Denial of Service (DDoS) attack. The list is available via HTTP and HTTPS at http://feeds.dshield.org/block.txt or https://secure.dshield.org/feeds/block.txt. The list uses a simple tab-delimited format, which eases parsing by automated scripts. A PGP signature is provided at http://feeds.dshield.org/block.txt.asc. (See http://www.dshield.org/block_list_info.html for current information on using the block list.)
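A minimal Python sketch of parsing such a tab-delimited list with strict validation. The column layout is an assumption, since it is not specified here: lines starting with "#" are treated as comments, the first field as the network start address, and the third field as the prefix length. The min_prefix safeguard is also hypothetical.

```python
import ipaddress

def parse_block_list(text, min_prefix=16):
    """Return validated IPv4 networks from tab-delimited block list text.

    Assumed layout (not specified in the article): lines starting
    with '#' are comments; the first field is the network start
    address and the third field is the prefix length.
    """
    networks = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # malformed line: skip rather than guess
        try:
            net = ipaddress.IPv4Network(f"{fields[0]}/{fields[2]}", strict=True)
        except ValueError:
            continue  # not a clean network address: reject
        # Refuse very broad entries that could lock out valid users.
        if net.prefixlen < min_prefix:
            continue
        networks.append(net)
    return networks
```

Anything that does not parse as a network address is dropped, so a header row or an injected string never reaches later processing.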

DShield.org manually reviews this block list and notifies the networks that are on the list so that they can stop this behavior. The primary criterion for inclusion is the number of targets that have reported attacks from a listed network over the previous three days. The total number of different targets (rather than the total number of accesses) is a better indicator of danger because the attacks are attempting to infect or exploit a large number of machines. Even after a network has been added to the "blocked IP" list, attacks will continue. When implemented, this list can prevent users from being affected by the attacks.

As an example, DShield.org includes a script to generate iptables rules using this block list. Writing a script to automatically generate iptables rules from a list retrieved online poses a number of challenges. First, the script requires root privileges in order to run. Second, you must carefully validate the retrieved content to guard against an altered list that may include false entries intended to block access by valid users.

While using Perl's taint mode is a minimum requirement, the script also requires the use of digital signatures to validate the content. The sample script below uses the PGP signature provided by DShield. It assumes that the necessary keys are already present in the executing user's keyring. An alternative and simpler method is to utilize HTTPS, but many users do not have an HTTPS-capable version of the Perl LWP module installed, and it is easier to install GNU Privacy Guard (GnuPG).
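The signature check can be sketched independently of the Perl script. This Python example calls the gpg command-line tool. It assumes that block.txt.asc is a detached signature for block.txt and that the DShield public key is already in the keyring. The function names are illustrative.

```python
import subprocess

def gpg_verify_command(signature_path, data_path):
    """Build the GnuPG command line for checking a detached signature."""
    return ["gpg", "--verify", signature_path, data_path]

def signature_is_valid(signature_path, data_path):
    """Return True only if gpg exits with status 0 (good signature)."""
    try:
        result = subprocess.run(
            gpg_verify_command(signature_path, data_path),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except FileNotFoundError:
        return False  # gpg is not installed: treat as unverified
    return result.returncode == 0
```

A caller would refuse to touch the firewall unless signature_is_valid("block.txt.asc", "block.txt") returns True. Passing the arguments as a list, rather than as a shell string, keeps file names from being interpreted by a shell.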

The script generates a separate chain called "BLOCKLIST". Using a new chain instead of adding the rules to an existing chain will ease maintenance and lessen the probability of interfering with existing rules. The BLOCKLIST chain should be called from the INPUT or FORWARD chain. A possible setup would look like this:

# allow trusted sources, which we never want to lock out
iptables -A INPUT -s (...trusted ip...) (..further restrictions, e.g. port..) -j ACCEPT
# call BLOCKLIST
iptables -A INPUT -j BLOCKLIST
# execute remainder of firewall rules
# iptables -A INPUT ....

The same sequence can be used for other chains, such as FORWARD. The Perl script in Listing 1 will retrieve the block list and add the rules to the BLOCKLIST chain. The relevant PGP public keys can be found at http://www.dshield.org/dshield_public_key.txt. You may want to define a small chain to log blocked accesses distinctively. For example, use a chain like:

$IPTABLES -N LOGBLOCK
$IPTABLES -A LOGBLOCK -j LOG --log-level warning --log-prefix "filter: BLOCKLIST "
$IPTABLES -A LOGBLOCK -j DROP

To use this new custom chain, change the following in Listing 1:

my $blocktarget='DROP'

to read:

my $blocktarget='LOGBLOCK'
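For readers without Listing 1 at hand, the rule-generation step can be sketched in Python. This is not the Perl script itself. The function name is illustrative, and the sketch assumes the BLOCKLIST chain has already been created and that the entries come from a verified list.

```python
import ipaddress

def blocklist_commands(networks, target="DROP", iptables="iptables"):
    """Return iptables argument lists that rebuild the BLOCKLIST chain.

    networks: iterable of network strings such as '192.0.2.0/24'.
    Each entry is re-validated so that nothing but a parsed IPv4
    network ever reaches the command line (ValueError otherwise).
    """
    commands = [[iptables, "-F", "BLOCKLIST"]]  # empty the chain first
    for entry in networks:
        net = ipaddress.IPv4Network(entry, strict=True)
        commands.append(
            [iptables, "-A", "BLOCKLIST", "-s", str(net), "-j", target]
        )
    return commands
```

Passing target="LOGBLOCK" corresponds to the change to $blocktarget shown above. The function only builds argument lists, so they can be inspected before being run with root privileges.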

Eliminating Attacks

It is important to notify administrators that machines under their control have been accessing other machines in a hostile manner. Administrators can then investigate the suspected machine to determine whether the accesses were caused by a user performing cracking activity or, more likely, by a compromised machine that is attempting to compromise other machines. However, the number of firewall users has increased vastly compared to the days when only professional administrators maintained firewalls, and administrators are now deluged with abuse reports.

One problem that can occur when individual users send abuse reports is that activity that could be considered hostile might actually have been caused by an innocent mistake, such as mistyping a URL. Therefore, if individual users send abuse reports, there is the danger that administrators will be flooded with abuse reports that are based on innocent mistakes.

A second potential problem caused by individual users submitting abuse reports is that most residential users of personal firewalls are not trained in security. Consequently, they may not know how to differentiate true hostile activity from "normal" network activity, such as DHCP (Dynamic Host Configuration Protocol) traffic.

A third problem is the lack of standardization for abuse reporting when individuals submit abuse reports. Each one is different, meaning that administrators receiving these reports must spend additional time studying them to determine whether the data is significant. A standard format abuse report allows administrators to quickly scan for relevant information. An even better solution would be to eliminate sending abuse reports by email altogether and replace them with more efficient summary reports tailored to an administrator's needs.

DShield attempts to alleviate these problems by encouraging its users to let DShield send the abuse reports. DShield-generated "Fightback" abuse reports are sent only after a summary report from the DShield database shows that accesses from a given source IP meet all of the following criteria:

1. The accesses are from a port that we consider indicative of suspicious activity

2. The accesses have been logged by a minimum number of separate target machines

3. No report has been sent to the administrator for this IP in the past month

4. At least one of the submitting users has agreed to have its reports forwarded

If, and only if, these criteria are met will DShield send a "Fightback" abuse report to the administrator of the network that controls the implicated source IP. The abuse report summarizes the suspected hostile access activity, giving log samples that show details of the suspected hostile accesses. A coded link is provided to a custom report describing the incident and showing all accesses linked to this source IP. This report includes accesses submitted to our database after the abuse message was sent, so that a concerned administrator can periodically check the database to see whether this machine has truly ceased the hostile activity.
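The four criteria combine into a single all-or-nothing check, which can be sketched in Python. The field names, the minimum of five targets, and the 30-day reading of "the past month" are hypothetical values chosen for the example. The actual thresholds are not stated here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class SourceSummary:
    """Hypothetical per-source summary; field names are illustrative."""
    suspicious_port: bool                  # criterion 1
    distinct_targets: int                  # criterion 2
    last_report_sent: Optional[datetime]   # criterion 3 (None = never)
    forwarding_consent: bool               # criterion 4

def should_send_report(summary, now, min_targets=5,
                       quiet_period=timedelta(days=30)):
    """Return True only if all four criteria hold for this source."""
    recently_notified = (
        summary.last_report_sent is not None
        and now - summary.last_report_sent < quiet_period
    )
    return (
        summary.suspicious_port
        and summary.distinct_targets >= min_targets
        and not recently_notified
        and summary.forwarding_consent
    )
```

Because every condition must hold, a single mistyped address reported by one user cannot trigger a report. This addresses the concern about innocent mistakes raised earlier.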

For large networks or ISPs, DShield provides custom "bulk" abuse reports as an alternative to individual email abuse reports. These are worked out on a case-by-case basis with the network administrators.

Future Plans

With more users likely to apply firewall rules and disable unneeded services, intrusion attempts are more likely to focus on the few remaining critical business services still commonly exposed to the outside. As is already happening, more information will be required to distinguish different types of attacks. In the immediate future, we plan to collect full packet content from some users. This will shorten our response time, as we will have full packets for further real-time analysis.

Summary

Host- and network-based intrusion detection should be part of every administrator's defense of a network against targeted attacks. While individual IDSs are frequently criticized as being reactive and more useful for forensics than for defense, joining them with a large-scale dIDS (such as DShield.org) will make them part of a proactive weapon in the administrator's arsenal.

Acknowledgements

DShield.org is currently supported by the SANS Institute. We would like to thank the numerous contributors and current as well as past cooperators. In particular, we'd like to thank Alan Paller, Stephen Northcutt, John Green, and Matt Fearnow for continued support of our activities.

Johannes Ullrich started DShield.org in November 2000. He joined the SANS Institute as CTO for the SANS Institute's Internet Storm Center in July 2001. Before that, he was employed as Lead Support Engineer by Banta Integrated Media.

Wayne Larmon is a computer consultant with more than 20 years of programming experience. He joined DShield.org shortly after its inception as a volunteer and is now a consultant responsible for client development.