Creating Realistic Corpora for Security and Forensic Education

Abstract

We present work on the design, implementation, distribution, and use of realistic forensic datasets to support digital forensics and security education. We describe in particular the "M57-Patents" scenario, a multi-modal corpus consisting of hard drive images, RAM images, network captures, and images from other devices typically found in forensics investigations, such as USB drives and cellphones. Corpus creation has been performed as part of a scripted scenario; consequently, it is less "noisy" than real-world data but retains the complexity necessary to support a wide variety of forensic education activities. Realistic forensic corpora allow direct comparison of approaches and tools across classrooms and institutions, reduce the time required to prepare useful educational materials, and eliminate concerns of exposing students to privacy-sensitive or illegal digital materials. The "M57-Patents" corpus can be freely redistributed without rights-restricted materials, and is available with disk images packaged in both open (Advanced Forensic Format) and commercial (EnCase) formats.

Document Details

Document Type
Technical Report
Publication Date
May 01, 2011
Accession Number
ADA549432

Entities

People

  • Adam Russell
  • Christopher A. Lee
  • David Dittrich
  • Kam Woods
  • Kris Kearton
  • Simson Garfinkel

Organizations

  • Naval Postgraduate School

Tags

Communities of Interest

  • Autonomy

DTIC Thesaurus Topics

  • Computational Forensics
  • Computer Crime
  • Computer Programs
  • Computer Science
  • Computers
  • Cryptography
  • Data Sets
  • Education
  • Information Operations
  • Instructors
  • Law
  • Materials
  • Operating Systems
  • Security
  • Social Media
  • Students
  • Validation

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments
  • Cybersecurity
  • Electrical Engineering