Privacy-Preserving and Robust Federated Learning

Abstract

Approved for Public Release. Problem Description: The proliferation of large and complex data has driven new developments in, and widespread use of, machine learning models, especially deep learning models. Their model complexity and expressiveness allow them to learn non-linear decision boundaries. This has led to the huge success of deep networks and their adoption in areas such as computer vision and natural language processing. The successful learning of these high-performance models often relies on demanding training datasets that are large-scale in sample size, high-fidelity in data quality, and wide in coverage of the data distribution. Moreover, in many practical applications of machine learning, learning involves highly sensitive data stored across distributed locations. For example, in face recognition systems, the learning of a face recognition model uses data from a large number of users, whose face images are often stored locally on their mobile devices, and users rarely agree to have their data transferred off their devices. Another common scenario is medical research, where the predictive modeling of diseases can benefit greatly from patients' medical records held by multiple health institutions, yet sharing health data is mostly prohibited by compliance regulations such as HIPAA and GDPR. In the decision-making domains of interest to ONR, high-performance deep learning models are desired to capture complicated patterns in data. Still, the information needed for training and inference may rely on data from different units that have strict data sharing policies. These challenges create a strong need for the federated learning paradigm, in which a set of distributed clients collaboratively learn a model while keeping their data confidential.
Proposed Approach: Over the past few years, the machine learning community has developed principled algorithmic frameworks for federated learning, studied their convergence behavior, and proposed many variants for dealing with heterogeneity during learning. However, major challenges still prevent federated learning from being deployed in practice. First, data confidentiality in federated learning does not prevent privacy leakage, since gradients or updated model parameters from clients can be used to conduct inversion attacks. Second, the non-iid data in federated learning, arising from data heterogeneity or device heterogeneity, may lead to unstable learning trajectories and suboptimal models. Third, the adversarial robustness desired by many sensitive applications imposes significant overhead on participating clients, which is not always affordable given resource constraints on client devices. This proposed research provides a unified solution to privacy-preserving and robust federated learning that jointly addresses these challenges. To enable privacy in federated learning, we propose a novel differentially private mechanism that allows multiple clients to collaboratively build models with minimal information sharing and with proven privacy guarantees. To address issues from imbalanced data, we propose a subspace-guided federated learning strategy so that biases from clients with long-tail data are kept from impacting the global model. To provide robustness on all participating devices, we propose a novel robustness transfer framework, so that clients with higher computing power can share robustness with clients that cannot afford adversarial training.
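
For readers unfamiliar with the setting, the following is a minimal, generic sketch of one server-side aggregation step in which client updates are norm-clipped and Gaussian noise is added to their average. It is a common textbook pattern and is not the mechanism proposed in this award, whose details are not given here. The function names (clip_update, private_average) and the parameters (max_norm, noise_multiplier) are hypothetical, and the sketch computes no formal privacy guarantee; that would require a separate privacy accounting step.

```python
import math
import random


def clip_update(update, max_norm):
    """Scale an update vector so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    if norm <= max_norm or norm == 0.0:
        return list(update)
    scale = max_norm / norm
    return [x * scale for x in update]


def private_average(client_updates, max_norm=1.0, noise_multiplier=0.1, rng=None):
    """Average clipped client updates and add Gaussian noise to the mean.

    Assumption: the noise standard deviation is
    noise_multiplier * max_norm / number_of_clients, a common choice
    when noise is applied to the averaged update.
    """
    rng = rng or random.Random(0)
    clipped = [clip_update(u, max_norm) for u in client_updates]
    n = len(clipped)
    dim = len(clipped[0])
    mean = [sum(u[i] for u in clipped) / n for i in range(dim)]
    sigma = noise_multiplier * max_norm / n
    return [m + rng.gauss(0.0, sigma) for m in mean]


if __name__ == "__main__":
    # Three hypothetical clients, each sending a 2-dimensional model update.
    updates = [[3.0, 4.0], [0.0, 0.5], [-0.2, 0.1]]
    print(private_average(updates, max_norm=1.0, noise_multiplier=0.1))
```

Clipping bounds how much any single client can move the global model, and the added noise masks individual contributions. A larger noise_multiplier gives more masking at the cost of a noisier global update.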

Document Details

Document Type
DoD Grant Award
Publication Date
Mar 08, 2024
Source ID
N000142412168

Entities

People

  • Anil K. Jain

Organizations

  • Michigan State University
  • Office of Naval Research
  • United States Navy

Tags

Fields of Study

  • Computer science

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments
  • Neural Network Machine Learning
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - Neural Networks