Applying Signature Extraction and Classification Algorithms on Expression Profiles of CD Markers and Toll Like Receptors to Classify and Predict Exposures to Various Pathogens

Abstract

To classify disease samples using high-throughput genomic and proteomic data, it is essential to decide which toll-like receptors and CD markers should be included in a predictor list. Too few markers may not be enough to discriminate and classify an exposure. Having too many markers is not optimal either, as some of these markers may be irrelevant to the diagnosis and may dilute the decisive information by adding noise. Efforts are therefore made to select an optimal set of targets with which to begin training a set of predictors. This is accomplished by a variety of means, such as neighborhood analysis (Golub et al 1999), principal component analysis (Khan et al 2000), and gene shaving (Hastie et al 2000).

Various algorithms and tools have been developed and described in the literature. These algorithms will serve as a foundation for the development of the statistical classification tool.

We will examine these algorithms to find the best prediction model and feature extraction method.

Initially, data generated using cDNA microarrays will be processed, filtered, and analyzed using in-house data analysis tools. Expression profiles for the toll-like receptors and CD markers for each pathogen at various time points will be extracted. These profiles will be used to identify the markers that are good discriminators for a given pathogen at a given time point. In the process of analyzing the data, we consider two assumptions: 1) the distribution of the gene intensities in a sample is normal, and 2) a gene is a good discriminator if it is present at a consistently high level in one class and absent or present at a consistently low level in the other class.

To validate each list of predictors, we will use our database of gene expression as a training set and add some blinded samples to see whether these predictors are able to identify an exposure by analyzing the expression profiles of toll-like receptors and CD markers correlated with this exposure.
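The second assumption in the abstract (a good discriminator is consistently high in one class and consistently low in the other) can be illustrated with a small sketch. The abstract does not say which scoring statistic the report uses; the example below assumes the signal-to-noise score associated with the neighborhood analysis of Golub et al 1999, (mean_a - mean_b) / (sd_a + sd_b). The function names, the marker names, and every intensity value are hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev


def signal_to_noise(class_a, class_b):
    """Signal-to-noise score: (mean_a - mean_b) / (sd_a + sd_b).

    A large absolute value means the marker differs strongly between
    the two classes and varies little within each class.
    """
    spread = stdev(class_a) + stdev(class_b)
    if spread == 0:
        raise ValueError("both classes have zero variance")
    return (mean(class_a) - mean(class_b)) / spread


def rank_markers(profiles, top_n=2):
    """Return the top_n marker names ordered by absolute score.

    profiles maps a marker name to a pair of lists:
    (values in exposed samples, values in control samples).
    """
    scores = {name: signal_to_noise(a, b) for name, (a, b) in profiles.items()}
    return sorted(scores, key=lambda m: abs(scores[m]), reverse=True)[:top_n]


# Hypothetical log-intensity values, invented for illustration only.
profiles = {
    "TLR4": ([9.1, 9.3, 9.0, 9.2], [5.0, 5.2, 4.9, 5.1]),
    "CD14": ([7.0, 7.4, 6.8, 7.2], [6.9, 7.3, 7.1, 6.7]),
    "CD69": ([4.1, 4.0, 4.2, 3.9], [8.0, 8.3, 7.9, 8.2]),
}

top_markers = rank_markers(profiles)
print(top_markers)
```

With these invented values, the markers whose levels separate cleanly between classes rank ahead of the one whose levels overlap. Dividing by the summed standard deviations is what encodes the word "consistently": a marker with a large mean difference but high within-class variation scores lower than one with the same difference and tight replicates.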

Document Details

Document Type
Technical Report
Publication Date
Feb 10, 2016
Accession Number
AD1006699

Entities

People

  • Seid Muhie

Organizations

  • Georgetown University

Tags

Communities of Interest

  • Human Systems

DTIC Thesaurus Topics

  • Algorithms
  • Biological Warfare
  • Cells
  • Data Analysis
  • Department Of Defense
  • Dna Microarrays
  • Ebola Virus
  • Encephalitis
  • Engineering
  • Equine Encephalitis
  • Gene Expression
  • Gram-Negative Bacterial Infections
  • Liquid Chromatography
  • Macrophages
  • Medical Personnel
  • Students
  • Viruses

Readers

  • Forest Ecology
  • Molecular Genetics
  • Oncology and Biomarker-Based Cancer Detection

Technology Areas

  • Biotechnology