Composable Robust Structured Data Inference
Abstract
Messy data, with heterogeneous values, missing entries, and large errors, presents a major obstacle to automated, data-driven discovery of models. Data cleaning is the first step in any data processing pipeline and has serious consequences for the results of any subsequent analysis, yet this step is generally performed using ad hoc methods. This effort seeks to cleanse data sets and build a structured data interface that reduces their noise, produces clean data sets, and leverages model selection and automated techniques.
Document Details
- Document Type: Technical Report
- Publication Date: Sep 01, 2021
- Accession Number: AD1146477
Entities
People
- Madeleine Udell
Organizations
- Cornell University