Molecular machine learning with conformer ensembles

Abstract

Virtual screening can accelerate drug discovery by identifying promising candidates for experimental evaluation. Machine learning is a powerful method for screening, as it can learn complex structure–property relationships from experimental data and make rapid predictions over virtual libraries. Molecules inherently exist as a three-dimensional ensemble and their biological action typically occurs through supramolecular recognition. However, most deep learning approaches to molecular property prediction use a 2D graph representation as input, and in some cases a single 3D conformation. Here we investigate how the 3D information of multiple conformers, traditionally known as 4D information in the cheminformatics community, can improve molecular property prediction in deep learning models. We introduce multiple deep learning models that expand upon key architectures such as ChemProp and SchNet, adding elements such as multiple-conformer inputs and conformer attention. We then benchmark the performance trade-offs of these models on 2D, 3D and 4D representations in the prediction of drug activity using a large training set of geometrically resolved molecules. The new architectures perform significantly better than 2D models, but their performance is often just as strong with a single conformer as with many. We also find that 4D deep learning models learn interpretable attention weights for each conformer.
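
The conformer attention mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's architecture: it assumes each conformer has already been encoded as a fixed-length embedding, and it scores conformers by a dot product with a hypothetical learned query vector before a softmax. The names `attention_pool`, `conformer_embeddings` and `query` are illustrative only.

```python
import math
from typing import List, Sequence, Tuple


def attention_pool(
    conformer_embeddings: Sequence[Sequence[float]],
    query: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Pool per-conformer embeddings into a single molecule embedding.

    Assumption: `query` stands in for a learned parameter vector; in a real
    model it would be trained jointly with the conformer encoder.
    Returns (pooled_embedding, attention_weights).
    """
    if not conformer_embeddings:
        raise ValueError("need at least one conformer embedding")

    # Score each conformer by its dot product with the query vector.
    scores = [
        sum(h * q for h, q in zip(emb, query)) for emb in conformer_embeddings
    ]

    # Numerically stable softmax over conformers: weights sum to 1.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # The molecule embedding is the attention-weighted sum of the conformers.
    dim = len(conformer_embeddings[0])
    pooled = [
        sum(w * emb[i] for w, emb in zip(weights, conformer_embeddings))
        for i in range(dim)
    ]
    return pooled, weights
```

Under these assumptions, the returned weights are the per-conformer quantities one would inspect for interpretability, and a single-conformer (3D) input reduces to a weight of 1 on that conformer.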

Document Details

Document Type: Pub Defense Publication
Publication Date: Aug 24, 2023
Source ID: 10.1088/2632-2153/acefa7

Entities

People

Rafael Gómez-Bombarelli
Simon Axelrod

Organizations

Defense Advanced Research Projects Agency
