Mixture of Gaussian Models for Classification and Hypothesis Testing under Differential Privacy
Abstract
Gaussian mixture models are an important tool in Bayesian decision theory. In this study, we focus on building such models over statistical databases protected under differential privacy. Our approach involves querying the necessary statistics from a database and using the noise-added responses, generated according to differential privacy, in classification and hypothesis testing. We first formally analyze the sensitivity of our query set. Since a statistic can be queried in multiple ways, either directly or indirectly, we analyze the sensitivities of the different querying methods. We find that adding Laplace noise can be problematic: for example, the variance-covariance matrix after noise addition may no longer be positive definite. We propose a heuristic algorithm to repair the noise-added variance-covariance matrix. We then examine the Bayes error under differential privacy through experiments with both simulated and real-life data, and demonstrate the conditions under which the impact of the added noise can be reduced. We compute the type I and type II errors under differential privacy for the one-sample z test, the one-sample t test, and the two-sample t test with equal variances, and show when a hypothesis test becomes unreliable under a differential privacy mechanism.
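The abstract does not spell out the repair heuristic, so the following is only an illustrative sketch of the problem it describes, not the report's algorithm. It adds Laplace noise to the entries of a 2x2 variance-covariance matrix and then restores positive definiteness by clipping eigenvalues to a small positive floor, a common stand-in technique. The function names, the `sensitivity` and `epsilon` parameters, and the `floor` value are all assumptions made for this example; in particular, `sensitivity` is taken to be the L1 sensitivity of the whole set of released entries.

```python
import math
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_cov_2x2(cov, sensitivity, epsilon, rng):
    # Hypothetical helper: perturb the three distinct entries of a
    # symmetric 2x2 matrix with Laplace noise of scale sensitivity/epsilon.
    scale = sensitivity / epsilon
    a = cov[0][0] + laplace_noise(scale, rng)
    b = cov[0][1] + laplace_noise(scale, rng)
    d = cov[1][1] + laplace_noise(scale, rng)
    return [[a, b], [b, d]]


def repair_psd_2x2(m, floor=1e-6):
    # Eigenvalue clipping (an assumed repair, not the report's heuristic):
    # decompose the symmetric matrix, raise any eigenvalue below `floor`
    # up to `floor`, and rebuild the matrix from the same eigenvectors.
    a, b, d = m[0][0], m[0][1], m[1][1]
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    lam1 = max(mean + radius, floor)
    lam2 = max(mean - radius, floor)
    theta = 0.5 * math.atan2(2.0 * b, a - d)  # angle of the leading eigenvector
    c, s = math.cos(theta), math.sin(theta)
    off = (lam1 - lam2) * c * s
    return [[lam1 * c * c + lam2 * s * s, off],
            [off, lam1 * s * s + lam2 * c * c]]


if __name__ == "__main__":
    rng = random.Random(0)
    true_cov = [[2.0, 0.5], [0.5, 1.0]]
    noisy = noisy_cov_2x2(true_cov, sensitivity=1.0, epsilon=0.5, rng=rng)
    print("noisy:   ", noisy)
    print("repaired:", repair_psd_2x2(noisy))
```

A matrix that is already positive definite passes through `repair_psd_2x2` unchanged up to rounding, so the repair only alters responses that the noise has actually broken. Clipping does not undo the noise itself; it only makes the matrix usable as a Gaussian covariance.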
Document Details
- Document Type: Technical Report
- Publication Date: Jan 01, 2012
- Accession Number: AD1191475
Entities
People
- Ali Inan
- Bowei Xi
- Murat Kantarcıoğlu
- Xiaosu Tong
Organizations
- University of Texas at Dallas