Partial Least Squares Discriminant Analysis and Bayesian Networks for Metabolomic Prediction of Childhood Asthma

Abstract

To explore novel methods for the analysis of metabolomics data, we compared the ability of Partial Least Squares Discriminant Analysis (PLS-DA) and Bayesian networks (BN) to build predictive plasma metabolite models of age-three asthma status in 411 three-year-olds (n = 59 cases and 352 controls) from the Vitamin D Antenatal Asthma Reduction Trial (VDAART). The standard PLS-DA approach predicted age-three asthma with an Area Under the Curve Convex Hull (AUCCH) of 81%; however, a permutation test indicated possible overfitting. In contrast, a predictive Bayesian network including 42 metabolites had a significantly higher AUCCH of 92.1% (p for difference < 0.001), with no evidence that this accuracy was due to overfitting. Both models provided biologically informative insights into asthma, in particular a role for dysregulated arginine metabolism and for several exogenous metabolites that deserve further investigation as potential causative agents. Because the BN model outperformed the PLS-DA model in accuracy and showed a lower risk of overfitting, it may represent a viable alternative to typical analytical approaches for the investigation of metabolomics data.
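
The abstract reports accuracy as AUCCH, the area under the convex hull of the ROC curve. As a rough illustration of how that metric relates to the ordinary AUC, here is a minimal pure-Python sketch. The function names (roc_points, upper_hull, auc, aucch) and the six-sample toy data are assumptions made for this example; they are not taken from the paper or from VDAART, and the paper's own software may compute the metric differently.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Point]:
    """ROC points for labels in {0, 1}; one point per distinct score threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one control")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points: List[Point] = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / n_neg, tp / n_pos))
    return points


def upper_hull(points: Sequence[Point]) -> List[Point]:
    """Upper convex hull of the ROC points, left to right (monotone chain)."""
    hull: List[Point] = []
    for p in sorted(set(points)):
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross >= 0:  # hull[-1] lies on or below the chord, so drop it
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def trapezoid_area(points: Sequence[Point]) -> float:
    """Area under a piecewise-linear curve given as ordered (x, y) points."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    return trapezoid_area(roc_points(scores, labels))


def aucch(scores: Sequence[float], labels: Sequence[int]) -> float:
    return trapezoid_area(upper_hull(roc_points(scores, labels)))


# Toy data (invented): 1 = case, 0 = control.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
toy_labels = [1, 0, 1, 1, 0, 0]
print(f"AUC   = {auc(toy_scores, toy_labels):.4f}")
print(f"AUCCH = {aucch(toy_scores, toy_labels):.4f}")
```

Because the hull can only sit on or above the empirical ROC curve, the AUCCH is never smaller than the plain AUC for the same scores. Comparing a model's AUCCH with the values obtained after refitting on shuffled labels, as in a permutation test, is one way to check whether an apparently high value reflects overfitting.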

Document Details

Document Type
Pub Defense Publication
Publication Date
Oct 23, 2018
Source ID
10.3390/metabo8040068

Entities

People

  • Augusto A Litonjua
  • Jessica Lasky-Su
  • Kathleen Lee-Sarwar
  • Mengna Huang
  • Michael McGeachie
  • Priyadarshini Kachroo
  • Rachel S Kelly
  • Scott T Weiss
  • Su Chu
  • Yamini Virkud

Organizations

  • National Heart, Lung, and Blood Institute
  • National Institute of Allergy and Infectious Diseases
  • United States Department of Defense

Tags

Readers

  • Molecular and Cellular Biology
  • Neural Network Machine Learning
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Neural Networks