Calculating Site-Specific Evolutionary Rates at the Amino-Acid or Codon Level Yields Similar Rate Estimates
Abstract
Site-specific evolutionary rates can be estimated from codon sequences or from amino-acid sequences. For codon sequences, the most popular methods use some variation of the dN/dS ratio. For amino-acid sequences, one widely used method is Rate4Site, which assigns a relative conservation score to each site in an alignment. How site-wise dN/dS values relate to Rate4Site scores is not known. Here we elucidate the relationship between these two rate measurements. We simulate sequences with known dN/dS, using either dN/dS models or mutation-selection models for simulation. We then infer Rate4Site scores on the simulated alignments, and we compare those scores to either true or inferred dN/dS values on the same alignments. We find that Rate4Site scores generally correlate well with true dN/dS, and the correlation strengths increase in alignments with greater sequence divergence and more taxa. Moreover, Rate4Site scores correlate very well with inferred (as opposed to true) dN/dS values, even for small alignments with little divergence. Finally, we verify this relationship between Rate4Site and dN/dS in a variety of empirical datasets. We conclude that codon-level and amino-acid-level analysis frameworks are directly comparable and yield very similar inferences.
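The site-wise comparison described in the abstract can be illustrated with a minimal Python sketch. It assumes the two sets of per-site rates are already available as equal-length lists, and it uses the Pearson correlation coefficient as the measure of agreement; the abstract does not state which correlation statistic the report uses, so that choice is an assumption. The function names `pearson` and `normalize_to_mean_one` are hypothetical, and the numbers in the usage lines are made-up toy values, not results from the report.

```python
from math import sqrt


def normalize_to_mean_one(rates):
    """Rescale per-site rates so that their mean is 1 (a relative scale).

    Assumption: this is one simple way to put rates on a relative footing;
    it is not claimed to be the normalization any particular tool applies.
    """
    if not rates:
        raise ValueError("rates must be non-empty")
    mean = sum(rates) / len(rates)
    if mean == 0:
        raise ValueError("mean rate is zero; cannot normalize")
    return [r / mean for r in rates]


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of per-site values."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 sites")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Toy, made-up per-site values for five alignment sites (illustration only).
site_dnds = [0.1, 0.5, 0.9, 0.3, 1.2]
site_amino_acid_rate = [0.2, 0.8, 1.5, 0.6, 1.9]

r = pearson(site_dnds, normalize_to_mean_one(site_amino_acid_rate))
print(f"Pearson correlation across sites: {r:.3f}")
```

Because the Pearson coefficient is unchanged by multiplying either list by a positive constant, rescaling one set of rates to a relative scale does not affect the correlation; only the ordering and spacing of sites matter.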
Document Details
- Document Type: Technical Report
- Publication Date: May 30, 2017
- Accession Number: AD1057629
Entities
People
- Claus O. Wilke
- Dariya K. Sydykova