Unique Signature Detection Program

Abstract

An approach was developed to quantitatively describe the relationship between the volatile organic chemical profile of human emanations, as measured by gas chromatography/mass spectrometry (GC/MS), and genetic composition, specifically the HLA complex. Variable selection was carried out first to reduce random noise in the analysis. Two statistical methods, analysis of variance (ANOVA) and Stepwise Linear Discriminant Analysis (SLDA), were then evaluated and used to further eliminate those components that did not contribute significantly to distinguishing the genotypes. SLDA was found to be superior because it takes the covariance structure of the data into account. For classification of the chemical profiles, several linear and nonlinear discriminant analyses were evaluated. The results showed that Linear Discriminant Analysis (LDA) is preferred among the linear classification methods and Support Vector Machines (SVM) among the nonlinear ones. The RTI methods classified individual samples into the correct genotype 90% of the time when the number of genotypes was relatively small (fewer than 10). The recommended approach is to select important components with SLDA and then classify with LDA when the sample size is small, or with SVM when the sample size is moderate or large.
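As a rough illustration of the simpler of the two screening methods named above, the sketch below computes a one-way analysis of variance F statistic for each variable and ranks the variables by it. It is not the report's implementation: the function names, the toy peak-area values, and the genotype labels "A" and "B" are all hypothetical. Because each variable is scored on its own, this kind of screening ignores covariance between variables, which is the limitation the abstract cites when preferring SLDA.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic for a single variable.

    groups: list of lists, one list of measured values per class.
    """
    k = len(groups)                      # number of classes
    n = sum(len(g) for g in groups)      # total number of samples
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        return float("inf")              # classes are perfectly separated
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_features(samples, labels):
    """Return (feature index, F) pairs sorted by F, largest first.

    samples: list of feature vectors (e.g. peak areas per compound).
    labels:  class label for each sample (e.g. a genotype code).
    """
    classes = sorted(set(labels))
    scores = []
    for j in range(len(samples[0])):
        groups = [[s[j] for s, y in zip(samples, labels) if y == c]
                  for c in classes]
        scores.append((j, anova_f(groups)))
    return sorted(scores, key=lambda t: t[1], reverse=True)


# Hypothetical toy data: 6 samples, 3 variables, 2 classes.
# Only variable 0 differs between the classes.
samples = [
    [1.0, 5.0, 3.0],
    [1.2, 4.0, 3.1],
    [0.8, 6.0, 2.9],
    [3.0, 5.5, 3.0],
    [3.2, 4.5, 3.2],
    [2.8, 5.0, 2.8],
]
labels = ["A", "A", "A", "B", "B", "B"]

ranking = rank_features(samples, labels)
for index, f_value in ranking:
    print(f"variable {index}: F = {f_value:.3f}")
```

Variables with a small F statistic would be candidates for removal before classification. A stepwise discriminant procedure would instead add or drop variables based on their joint contribution to class separation.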

Document Details

Document Type: Technical Report
Publication Date: Mar 09, 2005
Accession Number: ADA430955

Entities

People

  • James H. Raymer
  • Jun Liu
  • Larry Michael
  • Shiying Wu
  • Ye Hu

Organizations

  • RTI International

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Alcohols
  • Alkanes
  • Analysis Of Variance
  • Application Software
  • Chemical Compounds
  • Chemical Synthesis
  • Chemistry
  • Cyclic Hydrocarbons
  • Data Analysis
  • Detection
  • Dimensionality Reduction
  • Information Science
  • Operating Systems
  • Organic Chemistry
  • Spreadsheet Software
  • Statistical Analysis
  • Supervised Machine Learning

Readers

  • Analytical Chemistry
  • Molecular Genetics
  • Regression Analysis

Technology Areas

  • AI & ML
  • Biotechnology