Novelty Discovery with Heterogeneous Features: Cross-Feature Analysis for Database Intrusion Detection Systems
Abstract
This paper presents experiments with a unique machine learning method called Cross-Feature Analysis, a novelty discovery method that can easily accommodate heterogeneous features. The domain of our work is database security, with the goal of detecting both attacks that are similar to those seen in the past and completely novel attacks that have not yet been seen. The training data consists of database logs that contain no attacks, so supervised machine learning methods cannot be applied directly, and unsupervised machine learning methods are unsatisfactory because we have a variety of feature types, including numerical features, categorical features, and set-valued features. Cross-Feature Analysis, however, transforms our novelty discovery problem into multiple supervised machine learning problems, building one submodel for each feature by treating that feature as the class. New instances are then analyzed by the submodels to determine whether they are consistent (legitimate) or anomalous (suspicious). In our experiments we discovered that, by setting a limit on the number of submodels that reject an instance, our system can distinguish legitimate instances from attacks with perfect (100%) recall of real attacks and a specificity of 99.9% on legitimate instances for one data set, and with a recall of 97.2% and a specificity of 99.9% on another data set.
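The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes categorical features only, and it uses a deliberately simple vote-counting classifier as a stand-in for whatever submodel type the report actually uses. The function names, the `max_rejections` parameter, and the toy log rows are all hypothetical.

```python
from collections import Counter, defaultdict

def train_submodels(rows):
    """Build one submodel per feature, treating that feature as the class.

    Assumption: each submodel is a simple vote table mapping
    (other_feature_index, other_feature_value) -> counts of class values
    observed in the attack-free training rows.
    """
    n_features = len(rows[0])
    models = []
    for target in range(n_features):
        votes = defaultdict(Counter)
        for row in rows:
            for j, value in enumerate(row):
                if j != target:
                    votes[(j, value)][row[target]] += 1
        models.append(votes)
    return models

def predict(model, target, row):
    """Predict the target feature's value from the remaining features."""
    tally = Counter()
    for j, value in enumerate(row):
        if j != target:
            tally.update(model.get((j, value), Counter()))
    # None means the submodel has no evidence at all for this row.
    return tally.most_common(1)[0][0] if tally else None

def count_rejections(models, row):
    """Count submodels whose prediction disagrees with the observed value."""
    return sum(
        1 for target, model in enumerate(models)
        if predict(model, target, row) != row[target]
    )

def is_suspicious(models, row, max_rejections):
    """Flag a row when more than max_rejections submodels reject it."""
    return count_rejections(models, row) > max_rejections

# Hypothetical attack-free log rows: (user, command, table)
training_rows = [
    ("alice", "SELECT", "orders"),
    ("alice", "SELECT", "orders"),
    ("alice", "SELECT", "orders"),
    ("bob", "INSERT", "audit"),
    ("bob", "INSERT", "audit"),
]
models = train_submodels(training_rows)

legit = ("bob", "INSERT", "audit")
odd = ("alice", "INSERT", "audit")  # a combination never seen in training
print(count_rejections(models, legit), is_suspicious(models, legit, 1))
print(count_rejections(models, odd), is_suspicious(models, odd, 1))
```

The `max_rejections` threshold plays the role of the limit on rejecting submodels mentioned in the abstract: raising it trades recall for specificity. Numerical and set-valued features would need submodels suited to those types, which this sketch does not attempt.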
Document Details
- Document Type: Technical Report
- Publication Date: Jan 01, 2010
- Accession Number: AD1108478
Entities
People
- Adriane Chapman
- David Moore
- Erik Sax
- Irina Vayndiner
- Ken Samuel
- Peter Mork
Organizations
- MITRE Corporation