Statistical Method and Theory for Privacy and Fairness in Trustworthy Artificial Intelligence

Abstract

Project Summary. Approved for Public Release.

Research problem and objective: Trustworthy AI problems now arise in a wide range of industries, mission critical or not. Examples include finance and healthcare (medical institutions want to collaborate without concerns about data privacy), AI hiring (female job applicants are unfairly treated in video interviews), and autonomous driving (insufficient data from corner cases in training environments). It is therefore not surprising that governments have announced stricter and stricter regulations on AI, such as the well-known "General Data Protection Regulation" in the EU. The next generation of artificial intelligence should be driven by trustworthiness, beyond performance. This will lead to a paradigm shift in methodological and theoretical studies of AI.

Technical approaches: The theme of this proposal is to build trustworthy AI systems using either data-centric or algorithm-centric solutions. Our proposal consists of three projects exploring the privacy and fairness aspects of trustworthy AI. The first two projects deal with differential privacy (DP), achieved by either traditional algorithmic approaches or modern data-centric approaches, while the last project develops a theoretical benchmark for fair classification, together with user-friendly algorithms. Several statistical models are considered in this proposal, including linear regression, nonparametric regression/classification, and deep neural networks.

With the rapid development of machine learning, plentiful information can be predicted from massive data. Meanwhile, data privacy has drawn [...] framework of DP, we focus on an important but much less studied scenario in which datasets need to be partially privatized. In this scenario, conventional privacy-preserving approaches, such as noise injection and shuffling, no longer work. To this end, we propose a series of algorithmic solutions in Project 1.

To complement the algorithmic solutions, Project 2 considers protecting privacy using a data-centric approach, i.e., synthetic data generation. We will produce artificially created data sets that remove individual information but still retain statistical information similar to that of the raw data sets. Despite numerous synthesis algorithms, we still lack a theoretical understanding of how the generation of synthetic data affects the utility of downstream machine learning tasks. This motivates us to develop a statistical learning framework for the analysis of synthetic data.

Machine learning algorithms are widely integrated into high-stakes decision-making processes, such as job applications and criminal prediction. However, empirical studies have shown that most existing algorithms focus on performance, retaining or even amplifying the implicit unfairness in historical data. There are growing ethical concerns about machine learning algorithms, and official institutions and organizations advocate considering fairness in AI practice. The last project is devoted to establishing a theoretical benchmark for fair classification algorithms, based on which user-friendly, large-scale algorithms are developed with guaranteed statistical optimality.

Anticipated outcomes: We will develop user-friendly, publicly available software as a TensorFlow or PyTorch library. Its performance will be thoroughly evaluated on different real-world data sets. We will summarize our findings in publications.

Impact on DoD capacities: The U.S. Dept. of [...] privacy and civil liberties. The proposed research into the theoretical foundation of privacy and fairness in AI ensures a trusted AI ecosystem in ONR.

Document Details

Document Type
DoD Grant Award
Publication Date
Sep 08, 2022
Source ID
N000142212680

Entities

People

  • Guang Cheng

Organizations

  • Office of Naval Research
  • United States Navy
  • University of California, Los Angeles

Tags

Fields of Study

  • Computer science

Readers

  • Distributed Systems and Data Platform Development
  • Government and Public Administration Law
  • Neural Network Machine Learning

Technology Areas

  • AI & ML
  • AI & ML - DoD AI Strategy
  • AI & ML - Neural Networks