Fast and Cheap Genome wide Haplotype Construction via Optical Mapping

Abstract

We describe an efficient algorithm to construct genome wide haplotype restriction maps of an individual by aligning single molecule DNA fragments collected with Optical Mapping technology. Using this algorithm and small amount of genomic material, we can construct the parental haplotypes for each diploid chromosome for any individual, one from the father and the other from the mother. Since such haplotype maps reveal the polymorphisms due to single nucleotide differences (SNPs) and small insertions and deletions (RFLPs), they are useful in association studies, studies involving genomic instabilities in cancer, and genetics. For instance, such haplotype restriction maps of individuals in a population can be used in association studies to locate genes responsible for genetics diseases with relatively low cost and high throughput. If the underlying problem is formulated as a combinatorial optimization problem, it can be shown to be NP-complete (a special case of K-population problem). But by effectively exploiting the structure of the underlying error processes and using a novel analog of the Baum-Welch algorithm for HMM models, we devise a probabilistic algorithm with a time complexity that is linear in the number of makers. The algorithms were tested by constructing the first genome wide haplotype restriction map of the microbe T. Pseudoana, as well as constructing a haplotype restriction map of a 120 Megabase region of Human chromosome 4. The frequency of false positives and false negatives was estimated using simulated data. The empirical results were found very promising.

Open PDF

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2004
Accession Number
ADA439802

Entities

People

  • Bhubaneswar Mishra
  • Thomas Anantharaman
  • Venkatesh Mysore

Organizations

  • New York University

Tags

Communities of Interest

  • Biomedical

DTIC Thesaurus Topics

  • Abstracts
  • Algorithms
  • Chromosomes
  • Computational Complexity
  • Computational Science
  • Dynamic Programming
  • Equations
  • Genetic Phenomena
  • Genetic Structures
  • Genetics
  • Genome
  • Human Genome
  • Molecules
  • Probabilistic Models
  • Probability
  • Probability Distributions
  • Random Variables

Fields of Study

  • Biology

Readers

  • Molecular Genetics
  • Molecular and genetic basis of cancer.
  • Operations Research

Technology Areas

  • Biotechnology