Bayesian Robust Inference for Differential Gene Expression in cDNA Microarrays with Multiple Samples

Abstract

We consider the problem of identifying differentially expressed genes under different conditions using cDNA microarrays. Standard statistical methods cannot be used because there are typically thousands of genes and only a few replicates. Because of the many steps involved in the experimental process, from hybridization to image analysis, cDNA microarray data often contain outliers. For example, an outlying data value could occur because of scratches or dust on the surface, imperfections in the glass, or imperfections in the array production. We develop a robust Bayesian hierarchical model for testing for differential expression. Outliers are modeled explicitly using a t-distribution. The model includes an exchangeable prior for the variances, which allows different variances for the genes but still shrinks extreme empirical variances. Our model can be used for testing for differentially expressed genes among multiple samples, and can distinguish between the different possible patterns of differential expression when there are three or more samples. Parameter estimation is carried out using a novel version of Markov chain Monte Carlo that is appropriate when the model puts mass on subspaces of the full parameter space. The method is illustrated using two publicly available gene expression data sets. We compare our method to five other commonly used techniques, namely the one-sample t-test, the Bonferroni-adjusted t-test, Significance Analysis of Microarrays (SAM), and EBarrays in both its Lognormal-Normal and Gamma-Gamma forms. In an experiment with HIV data, our method performed better than these alternatives, on the basis of between-replicate agreement and disagreement.
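The abstract says that outliers are modeled with a t-distribution. The sketch below illustrates, under simplifying assumptions, why t-distributed errors give robustness: a t-distribution can be written as a normal whose precision is scaled by a gamma-distributed weight, and fitting it down-weights observations far from the center. This is not the report's hierarchical model or its MCMC sampler. It is a standard EM-style fit of a single location and scale with the degrees of freedom `nu` held fixed. The function name `robust_location`, the choice `nu=4`, and the log-ratio values are hypothetical and chosen only for illustration.

```python
def robust_location(values, nu=4.0, n_iter=100):
    """EM-style fit of location and scale under Student-t errors (nu fixed).

    Illustrative only: a single-gene, single-condition toy, not the
    hierarchical model described in the abstract.
    """
    n = len(values)
    mu = sum(values) / n
    # Floor the variance so identical inputs do not cause division by zero.
    sigma2 = max(sum((y - mu) ** 2 for y in values) / n, 1e-12)
    weights = [1.0] * n
    for _ in range(n_iter):
        # E-step: expected precision weight of each observation;
        # points far from mu (relative to sigma2) get small weights.
        weights = [(nu + 1.0) / (nu + (y - mu) ** 2 / sigma2) for y in values]
        # M-step: weighted mean and weighted variance.
        mu = sum(w * y for w, y in zip(weights, values)) / sum(weights)
        sigma2 = max(
            sum(w * (y - mu) ** 2 for w, y in zip(weights, values)) / n, 1e-12
        )
    return mu, sigma2, weights


# Made-up replicate log-ratios for one gene; the last value plays the outlier.
log_ratios = [0.1, -0.2, 0.05, 0.0, 5.0]
sample_mean = sum(log_ratios) / len(log_ratios)
mu, sigma2, weights = robust_location(log_ratios)

print("sample mean:", round(sample_mean, 3))
print("robust location:", round(mu, 3))
print("weights:", [round(w, 3) for w in weights])
```

In this toy example the outlying replicate receives the smallest weight, so the fitted location is pulled toward it less than the ordinary sample mean is.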

Document Details

Document Type
Technical Report
Publication Date
Jul 05, 2004
Accession Number
ADA478831

Entities

People

  • Adrian Raftery
  • Ka Y. Yeung
  • Raphael Gottardo
  • Roger E. Bumgarner

Organizations

  • University of Washington

Tags

Communities of Interest

  • Biomedical

DTIC Thesaurus Topics

  • Bayesian Networks
  • Computational Science
  • Data Science
  • Data Sets
  • DNA Microarrays
  • Gene Expression
  • Information Science
  • Lymphocytes
  • Markov Chains
  • Monte Carlo Method
  • Probability
  • Standards
  • Statistical Algorithms
  • Statistical Analysis
  • Statistical Inference
  • Statistical Tests
  • Statistics

Fields of Study

  • Mathematics

Readers

  • Computational Modeling and Simulation
  • Oncology and Biomarker-Based Cancer Detection
  • Statistical Inference

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • Space