Graphical User Interface for Novel Protein Generation using ProtGPT: Version 1

Abstract

This report details the Year 4 efforts of the Naval Research Laboratory (NRL) toward algorithm development for the optimization of biologic medical countermeasures. Machine/Deep Learning (ML/DL) techniques enable directed modification of antibody proteins, and the foundational implications of redesigning and synthesizing small molecules with novel properties are further developed through efficient biomolecular sequence generation. Specifically, recent advances in the production of novel single-domain antibodies (sdAbs) are mainly motivated by the emergence of antibiotic-resistant bacteria, which poses a perpetual challenge in antibiotic discovery. To aid in this task, we developed a graphical user interface (GUI) using Shiny, an open-source R package for building interactive applications. This application prototype (version 1) incorporates a generative pre-trained transformer (GPT) model for sdAb generation and guides the user through a set of cascaded steps for the input, exploration, and generation of new sequences based on an existing sdAb dataset. Summary statistics of the input and output datasets, exploratory analyses, and the proteins' physicochemical characteristics are also provided. Several validation measures are included alongside sequence generation, such as checks for redundancies, antibody numbering annotations, and proper sdAb characteristics. This automated process will allow for the generation of novel antibody sequences that can subsequently be evaluated for improvements to a specifically desired pharmacokinetic profile.
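
The redundancy check and summary statistics mentioned in the abstract could look roughly like the sketch below. It is a minimal illustration and not the report's implementation: the report's tool is built with Shiny in R, whereas this sketch uses Python. The function names `filter_generated` and `summarize` are hypothetical, the sequences in the usage note are made-up toy strings, and antibody numbering annotation is not covered.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def filter_generated(generated, training):
    """Keep generated sequences that are valid, unique, and absent from training.

    Hypothetical helper: an assumption for illustration, not the report's code.
    """
    seen = {s.strip().upper() for s in training}
    kept = []
    for seq in generated:
        seq = seq.strip().upper()
        if not seq or set(seq) - STANDARD_AA:
            continue  # empty, or contains a non-standard residue
        if seq in seen:
            continue  # redundant with training data or an earlier output
        seen.add(seq)
        kept.append(seq)
    return kept


def summarize(sequences):
    """Return sequence count, mean length, and pooled amino acid frequencies."""
    counts = Counter("".join(sequences))
    total = sum(counts.values())
    return {
        "n": len(sequences),
        "mean_length": total / len(sequences) if sequences else 0.0,
        "composition": {aa: counts[aa] / total for aa in sorted(counts)},
    }
```

With a toy training set `["QVQLVESGG"]` and toy outputs `["QVQLVESGG", "EVQLVESGG", "evqlvesgg", "EVQLXESGG"]`, `filter_generated` would keep only `"EVQLVESGG"`: the first output duplicates the training sequence, the third is a case variant of the second, and the fourth contains the non-standard residue `X`.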

Document Details

Document Type
Technical Report
Publication Date
Aug 23, 2023
Accession Number
AD1208924

Entities

People

  • Jerome A. Alvarez
  • Scott N. Dean

Organizations

  • United States Naval Research Laboratory

Tags

DTIC Thesaurus Topics

  • Algorithms
  • Amino Acids
  • Anti-Bacterial Agents
  • Antibodies
  • Artificial Intelligence
  • Artificial Intelligence Software
  • Bacteria
  • Biomolecules
  • Computer Programming
  • Deep Learning
  • Drug Therapy
  • Graphical User Interface
  • Information Science
  • Language
  • Natural Language Processing
  • Neural Networks
  • Peptides
  • Production
  • Proteins
  • Recurrent Neural Networks
  • User Interface

Readers

  • Computational Modeling and Simulation
  • Computer Science
  • Molecular and Cellular Biochemistry

Technology Areas

  • AI & ML
  • AI & ML - Neural Networks