Robust Metrics for Measuring Privacy Leakage and Data Provenance from Machine Learning Models
Abstract
Approved for Public Release: Large foundation models have shown immense promise in transforming how we interact with the world through large language models and AI art models. However, a major barrier to their further development and use is that they are all trained on large, barely curated datasets crawled from the internet, which may inadvertently contain sensitive content or copyrighted material. This, coupled with the fact that many foundation models are known to memorize and regurgitate their inputs, can lead to major problems with their use. Currently, these models are thoroughly audited or "red-teamed" using certain privacy and memorization metrics before release. However, as our recent preliminary work has shown, these metrics are not very robust: they can be readily manipulated by an incentivized model owner, and they do not behave well on out-of-distribution inputs. The goal of this project is, first, to investigate these robustness issues in more detail and, second, to design more robust metrics that address these challenges. We will do so mainly by building on our prior work on forgeability of model training paths, as well as by looking at groups of individuals or features that may lead to more meaningful solutions.
Document Details
- Document Type: DoD Grant Award
- Publication Date: May 15, 2024
- Source ID: N000142412304
Entities
People
- Kamalika Chaudhuri
Organizations
- Office of Naval Research
- United States Navy
- University of California, San Diego