Unique Signature Detection Program
Abstract
An approach was developed to quantitatively describe the relationship between the volatile organic chemical profile of human emanations, as measured by gas chromatography/mass spectrometry (GC/MS), and genetic composition, specifically the HLA complex. Variable selection was carried out first to reduce random noise in the analysis. Two statistical methods, analysis of variance (ANOVA) and Stepwise Linear Discriminant Analysis (SLDA), were then evaluated and used to further eliminate components that did not contribute significantly to distinguishing the genotypes. Not surprisingly, SLDA was found to be superior because it takes the covariance structure of the data into account. For classification of the chemical profiles, several linear and nonlinear discriminant analyses were evaluated. The results showed that Linear Discriminant Analysis (LDA) and Support Vector Machines (SVM) are preferred among the linear and nonlinear classification methods, respectively. The RTI methods classified individual samples into the correct genotype 90% of the time when the number of genotypes was relatively small (fewer than 10). The recommended approach is to select the important components with SLDA and then classify with LDA when the sample size is small, or with SVM when the sample size is moderate or large.
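The analysis-of-variance screening step mentioned in the abstract can be illustrated with a minimal sketch. This is not the report's implementation: the function names `anova_f` and `rank_variables`, the toy peak intensities, and the genotype labels "A" and "B" are all hypothetical, chosen only to show how a per-variable one-way F-statistic ranks chemical components by how well they separate groups.

```python
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F-statistic for a single variable across labeled groups."""
    groups = {}
    for v, g in zip(values, labels):
        groups.setdefault(g, []).append(v)
    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more samples than groups")
    grand = mean(values)
    # Between-group and within-group sums of squares.
    ss_between = sum(len(x) * (mean(x) - grand) ** 2 for x in groups.values())
    ss_within = sum((v - mean(x)) ** 2 for x in groups.values() for v in x)
    if ss_within == 0:
        return float("inf")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_variables(profiles, labels):
    """Return (variable index, F) pairs, most discriminating variable first."""
    n_vars = len(profiles[0])
    scores = [
        (j, anova_f([row[j] for row in profiles], labels)) for j in range(n_vars)
    ]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Hypothetical toy data: 6 samples x 3 peak intensities, 2 genotype labels.
profiles = [
    [1.0, 5.0, 3.0],
    [1.2, 4.0, 3.1],
    [0.9, 6.0, 2.9],
    [3.0, 5.5, 3.0],
    [3.1, 4.5, 3.2],
    [2.9, 5.0, 2.8],
]
labels = ["A", "A", "A", "B", "B", "B"]

ranking = rank_variables(profiles, labels)
for index, f_value in ranking:
    print(f"variable {index}: F = {f_value:.3g}")
```

Because each variable is scored on its own, this kind of screening ignores correlations between components. That is consistent with the abstract's remark that SLDA, which accounts for the data covariance structure, performed better.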
Document Details
- Document Type: Technical Report
- Publication Date: Mar 09, 2005
- Accession Number: ADA430955
Entities
People
- James H. Raymer
- Jun Liu
- Larry Michael
- Shiying Wu
- Ye Hu
Organizations
- RTI International