A Responsible De-Identification Of The Real Data Corpus: Building A Framework For PII Management

Abstract

De-identification methods have helped government organizations provide the public with useful information, promoting transparency and accountability while also protecting the individual privacy of the data subjects. However, due to the recent massive increase in data collection and improved methods of analysis, de-identification has become a more difficult task. This work outlines challenges and discusses procedures for making a potentially sensitive data set available to extramural researchers and institutions without significant risk to human subject privacy. We provide a detailed explanation of personally identifiable information to help us understand what forms of personally identifiable information can cause the most harm. Furthermore, we discuss the legality and ethics behind working with personally identifiable information to illustrate the importance of protecting privacy. We then offer a taxonomy of threats, vulnerabilities, and impacts and describe how these determine risk. Based on this taxonomy, we develop a framework to assess risk on the Real Data Corpus, a collection of forensic disk images containing personally identifiable information. In addition, we analyze de-identification methods such as pseudonymization and anonymization, and consider re-identification risks. Finally, we apply our framework and methodology to a real-world scenario to determine the risk of data disclosure to an extramural researcher.

Document Details

Document Type
Technical Report
Publication Date
Sep 01, 2016
Accession Number
AD1029652

Entities

People

  • Johanna An

Organizations

  • Naval Postgraduate School

Tags

Communities of Interest

  • Biomedical
  • Cyber

DTIC Thesaurus Topics

  • Accuracy
  • Big Data
  • Computational Science
  • Computer Programs
  • Computers
  • Data Mining
  • Data Science
  • Data Sets
  • Electronic Mail
  • Employment
  • Information Processing
  • Information Science
  • Information Systems
  • Network Science
  • Operating Systems
  • Personnel Management
  • Vulnerability

Fields of Study

  • Computer science

Readers

  • Artificial Intelligence
  • Cybersecurity
  • Government and Public Administration Law