Making Differentially Oblivious Algorithms Practical

Abstract

Research problem. Machine learning and data mining techniques allow us to automatically discover new patterns and gain intelligence,, often by combining various data sources. Since many data sources are sensitive (e.g., location traces, telemetry sensor data, genom,ic data, and health records), an important challenge is how to achieve privacy-preserving computation and secure information sharing,. There are two broad classes of approaches towards privacy-preserving computation, relying either on trusted hardware, or on crypto,graphic techniques for computing on encrypted data. No matter which approach is taken, it is well-known that encryption and the abil,ity to compute on encrypted data alone are not sufficient for privacy. Access patterns to even encrypted data can leak highly sensit,ive information, potentially leading to reconstruction of the entire sensitive dataset. Earlier, a line of research on oblivious alg,orithms focuses on how to provably obfuscate a program s access patterns, such that no information whatsoever is leaked. Although in, recent years (and partly due to our own prior work), there has been a significant leap in the design of oblivious algorithms, a wel,l-known lower bound suggests that oblivious algorithms in the most ,ive to the insecure baseline. In some applications, such logarithmic slowdown may be undesirable.Objectives and technical approaches,. In the proposed work, we will explore new techniques for achieving access pattern privacy, with little to no overhead. We will con,sider a new, relaxed notion of privacy called differential obliviousness, where we apply the well-accepted differential privacy noti,on to access patterns. In our preliminary work, we proved encouraging theoretical results, showing that differential obliviousness o,ften allows us to overcome some of the fundamental barriers related to full obliviousness. Inspired by the theoretical groundwork, t,he proposed work will take an application-driven approach, and aim to make differential obliviousness practical. Anticipated outcome,. Anticipated outcomes include the following --- software artifacts and publications will be made publicly available through open so,urce. First, we will design and implement a differentially oblivious database called ADORE (short for "A Differentially Oblivious Re,lational Engine"). ADORE will enjoy provable security, and will achieve at least an order of magnitude speedup relative to existing,well-known encrypted and oblivious databases. Second, we will propose a new theory of composition for differential obliviousness. We, will apply this new theory to ADORE, and design a query planner that is aware of the privacy budget and optimizes the consumption o,f privacy budget over time. Third, we will apply differential obliviousness to a federated learning scenario where an untrusted serv,er aims to learn some statistics or train some machine learning model, over data collected fromthousands to millions of user devices,.Impact on DoD capabilities. Our approach relies on randomization techniques to provide provably secure decoy, such that an attacker, cannot glean sensitive information during data sharing and distributed computation. Our approach defends against adversaries even w,ith strong capabilities, e.g., attackers that may even have physical access to the computing infrastructure, or attackers who have i,mplanted malware on the computing infrastructure. We take a principled approach that achieves security by construction and gives pro,vable mathematical guarantees, thus fundamentally breaking away from the traditional "build-it-break-it" cycle.

Document Details

Document Type
DoD Grant Award
Publication Date
Jan 14, 2022
Source ID
N000142212064

Entities

People

  • Elaine Shi

Organizations

  • Carnegie Mellon University
  • Office of Naval Research
  • United States Navy

Tags

Fields of Study

  • Computer science
  • Mathematics

Readers

  • Cybersecurity.
  • Neural Network Machine Learning.
  • Parallel and Distributed Computing.

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms
  • Cyber
  • Cyber - Cryptography