Profile Based Direct Kernels for Remote Homology Detection and Fold Recognition

Abstract

Remote homology detection between protein sequences is a central problem in computational biology. Supervised learning algorithms based on support vector machines (SVMs) are currently the most effective methods for remote homology detection. The performance of these methods depends on how the protein sequences are modeled and on the method used to compute the kernel function between them. We introduce new classes of kernel functions that are constructed by directly combining automatically generated sequence profiles with new and existing approaches for determining the similarity between pairs of protein sequences; these approaches employ effective schemes for scoring the aligned profile positions. Experiments on remote homology detection and fold recognition problems show that these kernels can produce results that are substantially better than those produced by all of the existing state-of-the-art SVM-based methods. In addition, the experiments show that these kernels, even when used in the absence of profiles, produce results that are better than those produced by existing nonprofile-based schemes.

Document Details

Document Type: Technical Report
Publication Date: Mar 31, 2005
Accession Number: ADA439489

Entities

People

  • George Karypis
  • Huzefa Rangwala

Organizations

  • University of Minnesota

Tags

DTIC Thesaurus Topics

  • Abstracts
  • Algorithms
  • Amino Acids
  • Classification
  • Computer Science
  • Detection
  • Experimental Design
  • Frequency
  • Information Operations
  • Kernel Functions
  • Military Research
  • Minnesota
  • Recognition
  • Sequences
  • Training

Fields of Study

  • Biology
  • Computer science

Readers

  • Neural Network Machine Learning
  • Sensor Fusion and Tracking Systems

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Neural Networks