Machine Learning Algorithms for Statistical Patterns in Large Data Sets
Abstract
Modern data analysis operations are continuously flooded with streams of noisy, incomplete, and sometimes intentionally misleading data. Traditional analysis methods cannot scale to handle these issues. We developed a battery of new, efficient, parallel, statistical machine learning algorithms to push the boundaries of machine learning capabilities under these circumstances. We have made many of our mature algorithms available as open source tools and published them in peer-reviewed academic journals and conferences. The algorithms cover a wide range of learning applications, but all rest on strong statistical foundations, and in that sense they all speak the same language. We have provided theoretical guarantees and proofs where possible and demonstrated the value of our algorithms on many interesting problems.
Document Details
- Document Type: Technical Report
- Publication Date: Feb 01, 2018
- Accession Number: AD1048823
Entities
People
- Arthur Dubrawski
Organizations
- Carnegie Mellon University