Machine Learning Algorithms for Statistical Patterns in Large Data Sets

Abstract

Modern data analysis operations are continuously flooded with streams of noisy, incomplete, and sometimes intentionally misleading data. Traditional analysis methods cannot scale to handle these issues. We developed a battery of new, efficient, parallel, statistical machine learning algorithms to push the boundaries of machine learning capabilities under these circumstances. We have made many of our mature algorithms available as open source tools and published them in peer-reviewed academic journals and conferences. The algorithms cover a wide range of learning applications, but all rest on strong statistical foundations, and in that sense they all speak the same language. We have provided theoretical guarantees and proofs where possible and demonstrated the value of our algorithms on many interesting problems.

Document Details

Document Type: Technical Report
Publication Date: Feb 01, 2018
Accession Number: AD1048823

Entities

People

  • Arthur Dubrawski

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • Autonomy
  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Artificial Intelligence Software
  • Coding
  • Computational Science
  • Computer Languages
  • Computer Programming
  • Computer Vision
  • Computers
  • Data Mining
  • Data Science
  • Digital Data
  • Digital Information
  • Identities
  • Information Processing
  • Information Science
  • Information Systems
  • Kernel Functions
  • Machine Learning
  • Mathematics
  • Metadata
  • Neural Networks
  • Notation
  • Supervised Machine Learning

Fields of Study

  • Computer science

Readers

  • Neural Network Machine Learning
  • Systems Analysis and Design
  • Technical Research and Report Writing

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms