Comprehensive exploration of graphically defined reaction spaces

Abstract

Existing reaction transition state (TS) databases are comparatively small and lack chemical diversity. Here, this data gap has been addressed using the concept of a graphically defined model reaction to comprehensively characterize a reaction space associated with C, H, O, and N containing molecules with up to 10 heavy (non-hydrogen) atoms. The resulting dataset is composed of 176,992 organic reactions possessing at least one validated TS, activation energy, heat of reaction, reactant and product geometries, frequencies, and atom-mapping. For 33,032 reactions, more than one TS was discovered by conformational sampling, allowing conformational errors in TS prediction to be assessed. Data is supplied at the GFN2-xTB and B3LYP-D3/TZVP levels of theory. A subset of reactions was recalculated at the CCSD(T)-F12/cc-pVDZ-F12 and ωB97X-D2/def2-TZVP levels to establish relative errors. The resulting collection of reactions and properties is called the Reaction Graph Depth 1 (RGD1) dataset. RGD1 represents the largest and most chemically diverse TS dataset published to date and should find immediate use in developing novel machine learning models for predicting reaction properties.
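
The quantities named in the abstract can be illustrated with a minimal sketch. It assumes each reaction is available as one reactant energy, one product energy, and one energy per TS conformer, all in Hartree. The record layout, field names, and numerical values below are hypothetical and are not the RGD1 file schema. The sketch computes one barrier per TS conformer, the reaction energy (a heat of reaction only if enthalpies are supplied instead of electronic energies), and the spread of barriers across conformers as a simple measure of conformational error.

```python
from dataclasses import dataclass
from typing import List

HARTREE_TO_KCAL = 627.509474  # kcal/mol per Hartree


@dataclass
class ReactionRecord:
    """Hypothetical record layout for illustration; not the RGD1 schema."""
    reactant_energy: float    # Hartree
    product_energy: float     # Hartree
    ts_energies: List[float]  # Hartree, one entry per TS conformer


def activation_energies(rec: ReactionRecord) -> List[float]:
    """Barrier for each TS conformer, in kcal/mol."""
    return [(e - rec.reactant_energy) * HARTREE_TO_KCAL for e in rec.ts_energies]


def reaction_energy(rec: ReactionRecord) -> float:
    """Product minus reactant energy, in kcal/mol."""
    return (rec.product_energy - rec.reactant_energy) * HARTREE_TO_KCAL


def conformational_spread(rec: ReactionRecord) -> float:
    """Highest minus lowest barrier across TS conformers, in kcal/mol."""
    barriers = activation_energies(rec)
    return max(barriers) - min(barriers)


# Made-up energies for a reaction with two TS conformers.
example = ReactionRecord(
    reactant_energy=-100.00,
    product_energy=-100.02,
    ts_energies=[-99.90, -99.89],
)

if __name__ == "__main__":
    print("lowest barrier (kcal/mol):", min(activation_energies(example)))
    print("reaction energy (kcal/mol):", reaction_energy(example))
    print("conformational spread (kcal/mol):", conformational_spread(example))
```

For a reaction with a single TS the spread is zero, so this measure is only informative for reactions where conformational sampling found more than one TS.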

Document Details

Document Type
Pub Defense Publication
Publication Date
Mar 20, 2023
Source ID
10.1038/s41597-023-02043-z

Entities

People

  • Brett Savoie
  • Lawal A. Ogunfowora
  • Michael Woulfe
  • Olexandr Isayev
  • Qiyuan Zhao
  • Sai Mahit Vaddadi
  • Sanjay S. Garimella

Organizations

  • Office of Naval Research

Tags

Readers

  • Quantum Chemistry
  • Regression Analysis
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Neural Networks
  • Space