More Powerful Discriminants for Classifying Phylogenetic Signals in Dinucleotide Frequencies

Abstract

Microbial DNA fragments are classified by species using compositional features and "genomic signatures," the oldest of which is the dinucleotide relative abundance profile defined by Karlin et al. More informative features, including higher-order signatures, have demonstrated greater species specificity than the baseline established by the dinucleotide signature, which uses "delta-distance" to assess dissimilarity, but the lack of standard methods has precluded rigorous comparison. We describe a new method for classifier evaluation that reduces any number of pairwise inter-genomic comparisons to a single performance measure. To illustrate the method, we compare delta-distance to the quadratic and linear discriminants prescribed by elementary pattern recognition theory and find that the quadratic form is significantly more powerful.
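
For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of the dinucleotide relative abundance profile and the delta-distance as they are commonly stated in the literature following Karlin et al.: each of the 16 dinucleotide frequencies is divided by the product of its two mononucleotide frequencies, and the distance is the mean absolute difference between two such profiles. The function names, the choice to pool each sequence with its reverse complement, and the handling of ambiguous bases are assumptions made for illustration; this record does not give the report's exact formulation.

```python
from itertools import product

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def relative_abundance(seq, symmetrize=True):
    """Dinucleotide relative abundance profile rho_xy = f_xy / (f_x * f_y).

    With symmetrize=True, counts are pooled over the sequence and its
    reverse complement (an assumption of this sketch). Positions with
    characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    strands = [seq, reverse_complement(seq)] if symmetrize else [seq]
    mono = dict.fromkeys(BASES, 0)
    di = {a + b: 0 for a, b in product(BASES, repeat=2)}
    for strand in strands:
        for base in strand:
            if base in mono:
                mono[base] += 1
        for i in range(len(strand) - 1):
            pair = strand[i:i + 2]
            if pair in di:
                di[pair] += 1
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    if n_mono == 0 or n_di == 0:
        raise ValueError("sequence has no countable dinucleotides")
    profile = {}
    for pair, count in di.items():
        fx = mono[pair[0]] / n_mono
        fy = mono[pair[1]] / n_mono
        fxy = count / n_di
        # A base that never occurs gives an undefined ratio; report 0.0.
        profile[pair] = fxy / (fx * fy) if fx > 0 and fy > 0 else 0.0
    return profile


def delta_distance(seq_a, seq_b):
    """Mean absolute difference between two 16-component profiles."""
    pa = relative_abundance(seq_a)
    pb = relative_abundance(seq_b)
    return sum(abs(pa[k] - pb[k]) for k in pa) / 16
```

A call such as `delta_distance(fragment_a, fragment_b)` returns a non-negative number that is zero when the two profiles coincide. The quadratic and linear discriminants compared in the report are not sketched here, because the record gives no detail on how they were constructed.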

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 2008
Accession Number: ADA504359

Entities

People

  • Changwon Jeon
  • David K. Han
  • Hanseok Ko
  • Robert H. Baran

Organizations

  • Korea University

Tags

Communities of Interest

  • Advanced Electronics

DTIC Thesaurus Topics

  • Abstracts
  • Classification
  • Data Science
  • Databases
  • Detectors
  • Engineering
  • Errors
  • Frequency
  • Genetic Code
  • Information Science
  • Machine Learning
  • Probability
  • Random Variables
  • Sequences
  • Signal Detection
  • Signal Processing
  • Standards

Fields of Study

  • Biology

Readers

  • Microbial Pathology
  • Oncology and Biomarker-Based Cancer Detection
  • Regression Analysis

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • Biotechnology