Learning the language of viral evolution and escape

Abstract

Viral escape, in which mutations allow a virus to evade neutralizing antibodies, can impede the development of vaccines. To predict which mutations may lead to viral escape, Hie et al. used a machine learning technique from natural language processing that models two components: grammar (or syntax) and meaning (or semantics) (see the Perspective by Kim and Przytycka). Three unsupervised language models were constructed, one each for influenza A hemagglutinin, HIV-1 envelope glycoprotein, and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike glycoprotein. Semantic landscapes for these viruses predicted viral escape mutations that produce sequences that are grammatically (syntactically) correct but effectively different in semantics and thus able to evade the immune system.
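
The idea in the abstract, that an escape candidate should be both grammatical and semantically changed, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the mutation names and scores are made-up toy values, the helper names `rank_ascending` and `escape_priority` are hypothetical, and summing the two ranks with a weight `beta` is only one plausible way to combine the signals. The abstract does not specify how the models score or combine them.

```python
from typing import Dict, List, Tuple


def rank_ascending(values: Dict[str, float]) -> Dict[str, int]:
    """Map each key to its rank, where 1 is the smallest value."""
    ordered = sorted(values, key=values.get)
    return {key: i + 1 for i, key in enumerate(ordered)}


def escape_priority(
    grammaticality: Dict[str, float],
    semantic_change: Dict[str, float],
    beta: float = 1.0,
) -> List[Tuple[str, float]]:
    """Order mutations so that those high on BOTH signals come first.

    Assumed combination rule: rank(semantic change) + beta * rank(grammaticality).
    """
    g_rank = rank_ascending(grammaticality)
    s_rank = rank_ascending(semantic_change)
    scores = {m: s_rank[m] + beta * g_rank[m] for m in grammaticality}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy inputs (hypothetical): in practice a language model would supply a
# likelihood for each mutation (grammaticality) and an embedding distance
# from the wild-type sequence (semantic change).
grammaticality = {"A1C": 0.30, "K2E": 0.35, "G3D": 0.01, "S4T": 0.40}
semantic_change = {"A1C": 0.2, "K2E": 0.9, "G3D": 1.5, "S4T": 0.1}

ranked = escape_priority(grammaticality, semantic_change)
top_mutation, top_score = ranked[0]
print(top_mutation, top_score)
```

In this toy data, G3D changes semantics the most but is the least grammatical, and S4T is the most grammatical but barely changes semantics. K2E scores well on both, so it is prioritized.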

Document Details

Document Type
Pub Defense Publication
Publication Date
Jan 15, 2021
Source ID
10.1126/science.abd7331

Entities

People

  • Bonnie Berger
  • Brian L Hie
  • Bryan Bryson
  • Ellen D Zhong

Organizations

  • Massachusetts General Hospital
  • Massachusetts Institute of Technology
  • National Institutes of Health
  • National Science Foundation
  • United States Department of Defense

Tags

Fields of Study

  • Biology

Readers

  • Computational Linguistics
  • Infectious Disease/Epidemiology
  • Virology (or Medical Virology)

Technology Areas

  • AI & ML
  • AI & ML - Machine Translation
  • Biotechnology