Everything You Wanted to Know About Blacklists But Were Afraid to Ask
Abstract
This document compares the contents of 25 common public-internet blacklists in order to discover patterns in their shared entries. Some lists contain IP addresses and others contain domain names; these two types of lists form the two cohorts that are compared. The contents of the lists are compared directly. The contents are also expanded to closely related identifiers using a passive DNS data source, and these expanded contents are also compared. The list contents are also compared temporally to determine which list, if either, consistently provided shared indicators before the other. The results demonstrate that most of the time, list contents are unique. There is surprisingly little overlap between any two blacklists. Though there are exceptions to this pattern, the intersection between the lists in general remains low even after expanding each list to a larger neighborhood of related indicators. The results also show that some lists do consistently provide content before certain other lists, but more often there is no intersection in the first place. When there is intersection, there is often no pattern to which list came first. These results suggest that each blacklist describes a distinct sort of malicious activity. The lists do not appear to converge on one version of all the malicious indicators for the internet at large. Network defenders would therefore be well advised to obtain and evaluate as many lists as practical, since it does not appear that any new list can be rejected out of hand as redundant.
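The direct and temporal comparisons described in the abstract can be illustrated with a minimal Python sketch. It is not the report's actual method or tooling: the function names `pairwise_overlap` and `first_reporter`, the list names, the sample indicators (taken from reserved documentation address ranges), and the use of the Jaccard index as the overlap measure are all assumptions made for illustration.

```python
from datetime import date
from itertools import combinations


def pairwise_overlap(lists):
    """Return {(a, b): (shared_count, jaccard)} for every pair of named lists.

    `lists` maps a list name to a set of indicators (IP addresses or
    domain names). The Jaccard index is an illustrative choice of
    overlap measure, not necessarily the one used in the report.
    """
    result = {}
    for a, b in combinations(sorted(lists), 2):
        shared = lists[a] & lists[b]
        union = lists[a] | lists[b]
        jaccard = len(shared) / len(union) if union else 0.0
        result[(a, b)] = (len(shared), jaccard)
    return result


def first_reporter(first_seen_a, first_seen_b):
    """Count, over the shared indicators, which list reported each one first.

    Each argument maps an indicator to the date it first appeared on
    that list. Indicators present on only one list are ignored.
    """
    counts = {"a_first": 0, "b_first": 0, "tie": 0}
    for indicator in first_seen_a.keys() & first_seen_b.keys():
        seen_a, seen_b = first_seen_a[indicator], first_seen_b[indicator]
        if seen_a < seen_b:
            counts["a_first"] += 1
        elif seen_b < seen_a:
            counts["b_first"] += 1
        else:
            counts["tie"] += 1
    return counts


# Hypothetical sample data; the addresses are reserved documentation ranges.
sample_lists = {
    "list_a": {"192.0.2.1", "192.0.2.2", "198.51.100.7"},
    "list_b": {"192.0.2.2", "203.0.113.9"},
    "list_c": {"203.0.113.50"},
}
overlap = pairwise_overlap(sample_lists)

timing = first_reporter(
    {"192.0.2.2": date(2013, 1, 5), "192.0.2.1": date(2013, 1, 1)},
    {"192.0.2.2": date(2013, 1, 9)},
)
```

In this toy data only `list_a` and `list_b` share an entry, which mirrors the low-intersection pattern the abstract reports. Expanding each set with passive DNS neighbors before calling `pairwise_overlap` would correspond to the expanded comparison, though that lookup is outside the scope of this sketch.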
Document Details
- Document Type: Technical Report
- Publication Date: Sep 01, 2013
- Accession Number: AD1180386
Entities
People
- Jonathan M. Spring
- Leigh Metcalf
Organizations
- Carnegie Mellon University