PAVE: Write-print Creation with MapReduce

Abstract

Cyber-crime committed through anonymous e-mails is becoming alarmingly common. Author attribution helps digital forensics investigators filter a large set of possible authors and focus traditional investigative techniques on the most probable culprits. A recent, promising technique is to construct a write-print for each known author and compare it to the write-print extracted from the anonymous message(s). A write-print is a unique digital fingerprint created by mining frequent patterns from a particular author's writing style. Parallel computing enables us to leverage multiple cores in the creation of author write-prints. We introduce Parallel Author Verification of E-mail (PAVE), a MapReduce algorithm for generating author write-prints in parallel. Our algorithm achieves up to 90% accuracy when tested on a subset of the Enron dataset. We believe the community will find the PAVE system useful for expediting author identification in time-sensitive situations.
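
The write-print idea described in the abstract can be illustrated with a minimal Python sketch. It is not the PAVE implementation: the three stylometric features in `extract_items`, the 0.6 support threshold, and the Jaccard comparison in `similarity` are all assumptions chosen for illustration. The map step counts the style patterns in each e-mail independently, the reduce step merges those counts, and the patterns that recur in enough of an author's e-mails form that author's write-print.

```python
from collections import Counter
from functools import reduce
from itertools import combinations


def extract_items(email):
    """Hypothetical features: discretize one e-mail into a set of style items."""
    words = email.split()
    avg_len = sum(len(w) for w in words) / max(len(words), 1)
    items = {
        "avg_word_len=" + ("short" if avg_len < 4.5 else "long"),
        "msg_len=" + ("brief" if len(words) < 12 else "extended"),
    }
    if "!" in email:
        items.add("uses_exclamation")
    return items


def map_email(email, max_size=2):
    """Map step: count 1 for every item combination (pattern) in one e-mail."""
    items = sorted(extract_items(email))
    return Counter(
        combo
        for size in range(1, max_size + 1)
        for combo in combinations(items, size)
    )


def reduce_counts(left, right):
    """Reduce step: merge two partial pattern counts."""
    return left + right


def write_print(emails, min_support=0.6):
    """Keep patterns found in at least min_support of the author's e-mails."""
    totals = reduce(reduce_counts, map(map_email, emails), Counter())
    threshold = min_support * len(emails)
    return {pattern for pattern, count in totals.items() if count >= threshold}


def similarity(print_a, print_b):
    """Jaccard overlap between two write-prints (an assumed comparison)."""
    union = print_a | print_b
    return len(print_a & print_b) / len(union) if union else 0.0


# Toy data, invented for this example.
alice = ["Great news! We won!", "Thanks so much! See you!", "Wow! That is cool!"]
bob = [
    "Regarding yesterday's quarterly projections, management requested documentation.",
    "Subsequent correspondence confirmed significant accounting irregularities.",
]
anonymous = ["Nice job! Well done!"]

anon_print = write_print(anonymous)
scores = {
    "alice": similarity(write_print(alice), anon_print),
    "bob": similarity(write_print(bob), anon_print),
}
print(max(scores, key=scores.get))
```

Because each call to `map_email` depends on a single e-mail, the built-in `map` used here could be replaced with a parallel map such as `multiprocessing.Pool.map` to spread the work across cores, which is the property a MapReduce formulation relies on.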

Document Details

Document Type
Technical Report
Publication Date
Aug 01, 2015
Accession Number
AD1005367

Entities

People

  • Alexander Molnar
  • Andreas Kellas
  • Frederick Ulrich
  • Leo St. Amour
  • Suzanne J. Matthews

Organizations

  • United States Military Academy

Tags

Communities of Interest

  • Cyber
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Accuracy
  • Algorithms
  • Computer Science
  • Data Mining
  • Data Sets
  • Directories
  • Electronic Mail
  • Experimental Design
  • Feature Selection
  • Frequency
  • High Performance Computing
  • Identification
  • Online Communications
  • Parallel Computing
  • Training
  • United States
  • United States Military Academy

Fields of Study

  • Computer science

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments
  • Computer Vision
  • Systems Analysis and Design

Technology Areas

  • Cyber