Machine Learning in the Presence of an Adversary: Attacking and Defending the SpamBayes Spam Filter

Abstract

Machine learning techniques are often used for decision making in security-critical applications such as intrusion detection and spam filtering. However, much of the security analysis surrounding learning algorithms is theoretical. This thesis provides a practical evaluation of the algorithms used by SpamBayes, a statistical spam filter, to determine its ability to correctly distinguish spam email from normal email when learning in the presence of an adversary. It presents both attacks against SpamBayes and defenses against these attacks. The attacks subvert the spam filter by causing a high percentage of both false positives and false negatives. With only 100 attack emails, out of an initial training corpus of 10,000, the spam filter's performance is degraded enough either to cause a denial of service or to allow spam emails to bypass the filter. The defenses shown in this thesis work against the attacks developed against SpamBayes and are sufficiently generic to be easily extended to other statistical machine learning algorithms.

Document Details

Document Type
Technical Report
Publication Date
May 20, 2008
Accession Number
ADA518825

Entities

People

  • Udam Saini

Organizations

  • University of California, Berkeley

Tags

Communities of Interest

  • Autonomy
  • Cyber

DTIC Thesaurus Topics

  • Algorithms
  • California
  • Computer Science
  • Denial Of Service Attack
  • Detection
  • Detectors
  • Electrical Engineering
  • Electronic Mail
  • Engineering
  • Intrusion
  • Intrusion Detection
  • Intrusion Detection Systems
  • Intrusion Detectors
  • Learning
  • Machine Learning
  • Probability
  • Test Sets

Fields of Study

  • Computer science

Readers

  • Adaptive Control and Estimation with Uncertainty in Dynamic Systems
  • Combustion science or combustion engineering
  • Cybersecurity

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Neural Networks