Constructing and Classifying Email Networks from Raw Forensic Images
Abstract
Email addresses extracted from secondary storage devices are important to a forensic analyst when conducting an investigation. They can provide insight into the user's social network and help identify other potential persons of interest. However, a large portion of the email addresses from any given device are artifacts of installed software and are of no interest to the analyst. We propose a method for discovering relevant email addresses by creating graphs consisting of extracted email addresses along with their byte-offset location in storage. We compute certain global attributes of these graphs to construct feature vectors, which we use to classify graphs into "useful" and "not useful" categories. This process filters out the majority of uninteresting email addresses. We show that, using network topological measures on the dataset tested, Naïve Bayes and SVM were successful in identifying 100% and 95.5%, respectively, of all graphs that contained useful email addresses, both with areas under the curve above 0.97 and with F1 scores of 0.80 and 0.90 for Naïve Bayes and SVM, respectively. Our results show that using network science metrics as attributes to classify graphs of email addresses based on the graph's topology could be an effective and efficient tool for automatically delivering evidence to an analyst.
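The pipeline described in the abstract (graph construction from addresses and byte offsets, then global attributes as a feature vector) can be illustrated with a minimal sketch. The abstract does not state how edges are formed or which attributes are computed, so everything specific here is an assumption made for illustration: the proximity rule, the 256-byte `window`, the chosen attributes (node count, edge count, density, mean degree), the function names, and the sample data.

```python
from collections import defaultdict


def build_graph(hits, window=256):
    """Link distinct addresses whose byte offsets lie within `window` bytes.

    `hits` is a list of (byte_offset, address) pairs. The proximity rule and
    the default window are assumptions for this sketch, not the report's rule.
    """
    adjacency = defaultdict(set)
    for _, addr in hits:
        adjacency[addr]  # register every address as a node, even if isolated
    ordered = sorted(hits)
    for i, (off_a, addr_a) in enumerate(ordered):
        for off_b, addr_b in ordered[i + 1:]:
            if off_b - off_a > window:
                break  # later hits are even farther away
            if addr_a != addr_b:
                adjacency[addr_a].add(addr_b)
                adjacency[addr_b].add(addr_a)
    return dict(adjacency)


def components(adjacency):
    """Split the address graph into connected components (one graph each)."""
    seen, result = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adjacency[node] - comp)
        seen |= comp
        result.append(comp)
    return result


def global_features(adjacency, comp):
    """Return [nodes, edges, density, mean degree] for one component."""
    n = len(comp)
    m = sum(len(adjacency[v]) for v in comp) // 2  # each edge counted twice
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    mean_degree = 2 * m / n
    return [n, m, density, mean_degree]


# Hypothetical extraction output: (byte offset, email address).
hits = [
    (100, "alice@example.com"),
    (150, "bob@example.com"),
    (200, "carol@example.com"),
    (90000, "support@vendor.example"),
    (90100, "sales@vendor.example"),
]

graph = build_graph(hits)
feature_vectors = [global_features(graph, c) for c in components(graph)]
```

Each row of `feature_vectors` describes one graph and could then be passed, together with "useful" or "not useful" labels, to a classifier such as Naïve Bayes or an SVM.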
Document Details
- Document Type: Technical Report
- Publication Date: Sep 01, 2016
- Accession Number: AD1029649
Entities
People
- Gregory R. Allen
Organizations
- Naval Postgraduate School