Machine learning predicts new anti-CRISPR proteins

Abstract

The increasing use of CRISPR–Cas9 in medicine, agriculture, and synthetic biology has accelerated the drive to discover new CRISPR–Cas inhibitors as potential mechanisms of control for gene editing applications. Many anti-CRISPRs have been found that inhibit the CRISPR–Cas adaptive immune system. However, comparing all currently known anti-CRISPRs does not reveal a shared set of properties for facile bioinformatic identification of new anti-CRISPR families. Here, we describe AcRanker, a machine learning-based method to aid direct identification of new potential anti-CRISPRs using only protein sequence information. Using a training set of known anti-CRISPRs, we built a model based on XGBoost ranking. We then applied AcRanker to predict candidate anti-CRISPRs from predicted prophage regions within self-targeting bacterial genomes and discovered two previously unknown anti-CRISPRs: AcrIIA20 (ML1) and AcrIIA21 (ML8). We show that AcrIIA20 strongly inhibits Streptococcus iniae Cas9 (SinCas9) and weakly inhibits Streptococcus pyogenes Cas9 (SpyCas9). We also show that AcrIIA21 inhibits SpyCas9, Staphylococcus aureus Cas9 (SauCas9) and SinCas9 with low potency. The addition of AcRanker to the anti-CRISPR discovery toolkit allows researchers to directly rank potential anti-CRISPR candidate genes for increased speed in testing and validation of new anti-CRISPRs. A web server implementation for AcRanker is available online at http://acranker.pythonanywhere.com/.
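
The idea of ranking candidate proteins from sequence alone can be illustrated with a minimal sketch. It is not the AcRanker implementation: the abstract does not specify the features, so amino acid composition is assumed here for illustration, and toy_score is a hypothetical placeholder with arbitrary weights standing in for a trained XGBoost ranking model. The sequences and the names orf1 to orf3 are made up.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the 20-dimensional amino acid frequency vector of a sequence.

    Characters outside the 20 standard residues are ignored.
    """
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def rank_candidates(proteins, score_fn):
    """Score every protein and return (name, score) pairs, best first."""
    scored = [(name, score_fn(composition(seq))) for name, seq in proteins.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def toy_score(features):
    """Hypothetical stand-in for a trained ranker (arbitrary weights).

    It simply sums the D and E frequencies; a real model would be
    learned from known anti-CRISPRs.
    """
    return features[AMINO_ACIDS.index("D")] + features[AMINO_ACIDS.index("E")]


# Made-up candidate proteins from one hypothetical prophage region.
proteins = {
    "orf1": "MKKLLAAGG",
    "orf2": "MDEEDDLSE",
    "orf3": "MSTDEAAKL",
}

ranking = rank_candidates(proteins, toy_score)
for name, score in ranking:
    print(f"{name}\t{score:.3f}")
```

In a workflow of this shape, only the top-ranked candidates from each region would be carried forward to experimental testing, which is the speed-up the abstract describes.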

Document Details

Document Type
Pub Defense Publication
Publication Date
Apr 14, 2020
Source ID
10.1093/nar/gkaa219

Entities

People

  • Amina Asif
  • Anthony T. Iavarone
  • Fayyaz Ul Amir Afsar Minhas
  • Gavin J Knott
  • Jennifer Doudna
  • Kyle E Watters
  • Simon Eitzinger

Organizations

  • Defense Advanced Research Projects Agency
  • Howard Hughes Medical Institute
  • National Institutes of Health
  • National Science Foundation
  • National University of Computer and Emerging Sciences
  • Pakistan Institute of Engineering and Applied Sciences
  • University of California
  • University of California, Berkeley
  • University of Warwick
  • Yusuf Hamied Department of Chemistry

Tags

Fields of Study

  • Biology

Readers

  • Microbial Pathology
  • Molecular Genetics
  • Neural Network Machine Learning

Technology Areas

  • AI & ML
  • AI & ML - Neural Networks
  • Biotechnology
  • Biotechnology - Cancer Biotech