QuartetS-DB: A Large-Scale Orthology Database for Prokaryotes and Eukaryotes Inferred by Evolutionary Evidence

Abstract

Background: The concept of orthology is key to decoding evolutionary relationships among genes across different species using comparative genomics. QuartetS is a recently reported algorithm for large-scale orthology detection. Based on the well-established evolutionary principle that gene duplication events discriminate paralogous from orthologous genes, QuartetS has been shown to improve orthology detection accuracy while maintaining computational efficiency.

Description: QuartetS-DB is a new orthology database constructed using the QuartetS algorithm. The database provides orthology predictions among 1621 complete genomes (1365 bacterial, 92 archaeal, and 164 eukaryotic) covering more than seven million proteins and four million pairwise orthologs. It is a major source of orthologous groups, containing more than 300,000 groups of orthologous proteins and 236,000 corresponding gene trees. The database also provides over 500,000 groups of inparalogs. In addition to its size, a distinguishing feature of QuartetS-DB is the ability to allow users to select a cutoff value that modulates the balance between prediction accuracy and coverage of the retrieved pairwise orthologs. The database is accessible at https://applications.bioanalysis.org/quartetsdb.

Conclusions: QuartetS-DB is one of the largest orthology resources available to date. Because its orthology predictions are underpinned by evolutionary evidence obtained from sequenced genomes, we expect its accuracy to continue to increase in future releases as the genomes of additional species are sequenced.

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2012
Accession Number
ADA572929

Entities

People

  • Chenggang Yu
  • Jaques Reifman
  • Li Cheng
  • Valmik Desai

Organizations

  • United States Army Medical Research and Development Command

Tags

DTIC Thesaurus Topics

  • Accuracy
  • Algorithms
  • Application Software
  • Biological Sciences
  • Computational Biology
  • Databases
  • Detection
  • Efficiency
  • Eukaryotes
  • Fungi
  • Genetics
  • High Performance Computing
  • Microbiology
  • Prokaryotes
  • Proteins
  • Web Browsers
  • Websites

Fields of Study

  • Biology

Readers

  • Molecular Genetics

Technology Areas

  • AI & ML