Increasing reproducibility, robustness, and generalizability of biomarker selection from meta-analysis using Bayesian methodology

Abstract

A major limitation of gene expression biomarker studies is that they are not reproducible, because they do not generalize to larger, real-world, heterogeneous populations. Frequentist multi-cohort gene expression meta-analysis has often been used to address this problem by identifying biomarkers that are truly differentially expressed. However, the frequentist meta-analysis framework has limitations: it needs at least 4–5 datasets with hundreds of samples, is prone to confounding from outliers, and relies on multiple-hypothesis-corrected p-values. To address these shortcomings, we have created a Bayesian meta-analysis framework for the analysis of gene expression data. Using real-world data from three different diseases, we show that the Bayesian method is more robust to outliers, creates more informative estimates of between-study heterogeneity, reduces the number of false positive and false negative biomarkers, and selects more generalizable biomarkers with less data. We have compared the Bayesian framework to a previously published frequentist framework and have developed a publicly available R package for its use.
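
To make the contrast between the two approaches concrete, here is a minimal sketch in Python for a single gene. It is not the authors' framework or their R package. It assumes a generic textbook random-effects model, y_i ~ N(mu, se_i^2 + tau^2), pooled once with the frequentist DerSimonian-Laird estimator and once with a Bayesian grid approximation over the between-study heterogeneity tau. The function names, the half-normal prior on tau, its scale, and the effect sizes are all hypothetical choices made for illustration.

```python
import math


def dersimonian_laird(effects, ses):
    """Frequentist random-effects pooling (DerSimonian-Laird); needs >= 2 studies."""
    w = [1.0 / s ** 2 for s in ses]
    sw = sum(w)
    mu_fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q and the method-of-moments estimate of tau^2.
    q = sum(wi * (yi - mu_fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    w_re = [1.0 / (s ** 2 + tau2) for s in ses]
    mu = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return mu, math.sqrt(1.0 / sum(w_re)), math.sqrt(tau2)


def bayes_random_effects(effects, ses, tau_max=2.0, n_grid=400, tau_scale=0.5):
    """Grid approximation of the posterior for y_i ~ N(mu, se_i^2 + tau^2).

    Assumed priors (illustrative only): flat on mu, half-normal(tau_scale) on tau.
    Returns posterior mean of mu, posterior sd of mu, posterior mean of tau.
    """
    taus = [tau_max * (k + 0.5) / n_grid for k in range(n_grid)]
    log_post, mu_hats, mu_vars = [], [], []
    for tau in taus:
        var = [s ** 2 + tau ** 2 for s in ses]
        w = [1.0 / v for v in var]
        v_mu = 1.0 / sum(w)  # conditional posterior variance of mu given tau
        mu_hat = v_mu * sum(wi * yi for wi, yi in zip(w, effects))
        log_lik = sum(
            -0.5 * math.log(v) - 0.5 * (y - mu_hat) ** 2 / v
            for v, y in zip(var, effects)
        )
        log_prior = -0.5 * (tau / tau_scale) ** 2
        # Marginal posterior of tau after integrating mu out analytically.
        log_post.append(log_prior + log_lik + 0.5 * math.log(v_mu))
        mu_hats.append(mu_hat)
        mu_vars.append(v_mu)
    top = max(log_post)
    probs = [math.exp(lp - top) for lp in log_post]
    total = sum(probs)
    probs = [p / total for p in probs]
    mu_mean = sum(p * m for p, m in zip(probs, mu_hats))
    # Law of total variance across the tau grid.
    mu_var = sum(
        p * (v + (m - mu_mean) ** 2) for p, v, m in zip(probs, mu_vars, mu_hats)
    )
    tau_mean = sum(p * t for p, t in zip(probs, taus))
    return mu_mean, math.sqrt(mu_var), tau_mean


# Hypothetical per-study effect sizes for one gene; the last study is an outlier.
example_effects = [0.8, 0.9, 0.7, 3.5]
example_ses = [0.2, 0.25, 0.2, 0.3]
print("DL   (mu, se, tau):     ", dersimonian_laird(example_effects, example_ses))
print("Bayes (mu, sd, E[tau]): ", bayes_random_effects(example_effects, example_ses))
```

The Bayesian version returns a full posterior summary for tau rather than a single point estimate that can be truncated at zero, which is one sense in which a heterogeneity estimate can be more informative. Robustness to outliers would additionally depend on modeling choices, such as heavier-tailed likelihoods, that this sketch does not include.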

Document Details

Document Type
Pub Defense Publication
Publication Date
Jun 27, 2022
Source ID
10.1371/journal.pcbi.1010260

Entities

People

  • Laurynas Kalesinskas
  • Purvesh Khatri
  • Sanjana Gupta

Organizations

  • Dr. Ralph and Marian Falk Medical Research Trust
  • Gates Foundation
  • National Institute of Allergy and Infectious Diseases
  • United States Department of Defense

Tags

Readers

  • Computational Modeling and Simulation
  • Prostate Cancer Biology
  • Regression Analysis

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Neural Networks