Statistical Characterization of Protein Ensembles (PREPRINT)

Abstract

When structural fluctuations or measurement errors are taken into account, a single rigid structure may not be sufficient to represent a protein. One way to address this problem is to represent the possible conformations as an ensemble, that is, a discrete set of observed conformations. In this work we follow a different, richer approach: we introduce a framework for estimating probability density functions in very high dimensions and apply it to represent ensembles of folded proteins. The proposed approach combines kernel density estimation, maximum likelihood, cross-validation, and bootstrapping. We present the underlying theoretical and computational framework, apply it to artificial data and to protein ensembles obtained from molecular dynamics simulations, and compare the results with those obtained experimentally, illustrating the potential and advantages of this representation.
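
The abstract names the ingredients but not how they fit together. The sketch below shows one common way to combine them and is not taken from the report. It assumes an isotropic Gaussian kernel, picks the bandwidth from a fixed grid by maximizing the leave-one-out (cross-validated) log-likelihood, and then uses bootstrap resampling to gauge the variability of the estimated log density at a query point. All function names, the bandwidth grid, and the synthetic 3-D data are hypothetical choices made for illustration only.

```python
import math
import random

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_kernel(a, b, h):
    """Log of an isotropic Gaussian kernel of bandwidth h centred at b, evaluated at a."""
    d = len(a)
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return -0.5 * sq / h**2 - 0.5 * d * math.log(2 * math.pi * h**2)

def kde_log_density(train, query, h):
    """Log density of the kernel estimate built on `train`, at a single point `query`."""
    return log_sum_exp([log_kernel(query, t, h) for t in train]) - math.log(len(train))

def loo_log_likelihood(data, h):
    """Mean leave-one-out log-likelihood: each point is scored by the KDE of the others."""
    total = 0.0
    for i, x in enumerate(data):
        others = data[:i] + data[i + 1:]
        total += kde_log_density(others, x, h)
    return total / len(data)

def select_bandwidth(data, grid):
    """Pick the grid bandwidth that maximizes the leave-one-out log-likelihood."""
    return max(grid, key=lambda h: loo_log_likelihood(data, h))

def bootstrap_log_density(data, query, h, n_boot=50, seed=0):
    """Bootstrap replicates of the log density at `query`, with the bandwidth held fixed."""
    rng = random.Random(seed)
    n = len(data)
    return [
        kde_log_density([data[rng.randrange(n)] for _ in range(n)], query, h)
        for _ in range(n_boot)
    ]

# Synthetic stand-in for conformation coordinates (assumption: 100 points in 3-D).
rng = random.Random(1)
samples = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(100)]

h = select_bandwidth(samples, [0.05, 0.1, 0.2, 0.4, 0.8, 1.6, 5.0])
reps = bootstrap_log_density(samples, [0.0, 0.0, 0.0], h)
```

The bandwidth is held fixed during the bootstrap because resampling with replacement duplicates points, and leave-one-out likelihood on duplicated points would favor an arbitrarily small bandwidth. A real protein ensemble would have far more dimensions than this toy example and would likely need an anisotropic or otherwise structured kernel.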

Document Details

Document Type: Technical Report
Publication Date: Mar 01, 2006
Accession Number: ADA478740

Entities

People

  • Diego Rother
  • Guillermo Sapiro
  • Vijay Pande

Organizations

  • University of Minnesota

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Algorithms
  • Computational Science
  • Computations
  • Equations
  • High Resolution
  • Information Science
  • Information Theory
  • Mathematics
  • Measurement
  • Probability
  • Probability Density Functions
  • Probability Distributions
  • Random Variables
  • Simulations
  • Statistics
  • Three Dimensional
  • Validation

Fields of Study

  • Computer science

Readers

  • Molecular and Cellular Biochemistry
  • Quantum spin resonance or Electron Paramagnetic Resonance spectroscopy
  • Regression Analysis