Statistical Sensitive Data Protection and Inference Prevention with Decision Tree Methods
Abstract
We present a new approach for protecting sensitive data in a relational table (columns: attributes; rows: records). If unauthorized users can infer sensitive data from non-sensitive data, we have the inference problem. We treat inference as correct classification and approach it with decision tree methods. As in our previous work, sensitive data are viewed as the classes of the test data, and non-sensitive data are the remaining attribute values. In general, however, sensitive data may not be associated with a single attribute (i.e., the class) but may be distributed among many attributes. We present a generalized decision tree method for distributed sensitive data. This method takes each attribute in turn as the class and analyzes the corresponding classification error. Attribute values that maximize an integrated error measure are selected for modification. Our analysis shows that modified attribute values can be restored and hence that sensitive data are not securely protected. This result implies that the modified values must themselves be subjected to protection. We present methods for this ramified protection problem and also discuss other statistical attacks.
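To make the "each attribute in turn as the class" step concrete, here is a minimal sketch in Python. It is not the report's implementation: the ID3-style tree, the leave-one-out error estimate, the toy table, and all function names are assumptions chosen for illustration. The integrated error measure and the selection of values to modify are not specified in the abstract and are not attempted here.

```python
from collections import Counter
from math import log2

def entropy(labels):
    # Shannon entropy of a list of class labels.
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def build_tree(rows, features, target):
    # ID3-style tree on categorical data (an assumed stand-in classifier).
    labels = [r[target] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority  # leaf: predict the majority class

    def remainder(feature):
        # Weighted entropy of the class after splitting on `feature`.
        groups = {}
        for r in rows:
            groups.setdefault(r[feature], []).append(r[target])
        return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

    best = min(features, key=remainder)
    rest = [f for f in features if f != best]
    branches = {}
    for value in set(r[best] for r in rows):
        subset = [r for r in rows if r[best] == value]
        branches[value] = build_tree(subset, rest, target)
    return (best, branches, majority)

def predict(tree, row):
    # Walk the tree; fall back to the node majority for unseen values.
    while isinstance(tree, tuple):
        feature, branches, majority = tree
        tree = branches.get(row[feature], majority)
    return tree

def inference_error(rows, target):
    # Leave-one-out error when `target` plays the role of the class.
    features = [f for f in rows[0] if f != target]
    wrong = 0
    for i, held_out in enumerate(rows):
        train = rows[:i] + rows[i + 1:]
        tree = build_tree(train, features, target)
        wrong += predict(tree, held_out) != held_out[target]
    return wrong / len(rows)

# Hypothetical toy table; attribute names and values are invented.
rows = [
    {"rank": "officer", "unit": "A", "clearance": "high"},
    {"rank": "officer", "unit": "B", "clearance": "high"},
    {"rank": "enlisted", "unit": "A", "clearance": "low"},
    {"rank": "enlisted", "unit": "B", "clearance": "low"},
    {"rank": "officer", "unit": "A", "clearance": "high"},
    {"rank": "enlisted", "unit": "B", "clearance": "low"},
]

# Take each attribute in turn as the class and record its error.
errors = {attr: inference_error(rows, attr) for attr in rows[0]}
for attr, err in errors.items():
    print(f"{attr}: leave-one-out error = {err:.2f}")
```

A low error for an attribute means its values can be inferred from the other attributes, so hiding that attribute alone would not protect it. In this toy table, rank and clearance determine each other, which is the kind of dependency the per-attribute analysis is meant to expose.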
Document Details
- Document Type: Technical Report
- Publication Date: Jan 01, 2003
- Accession Number: ADA465138
Entities
People
- Liwu Chang
Organizations
- United States Naval Research Laboratory