A POLYMER DATA REPOSITORY TO ENABLE DATA-DRIVEN POLYMER DISCOVERY AND SYNTHESIS

Abstract

The Materials Genome Initiative has brought about a paradigm shift in the design and discovery of novel materials [1]. In a growing number of applications, the materials innovation cycle has been greatly accelerated as a result of insights provided by data-driven materials informatics platforms [2–5]. High-throughput computational methodologies, data descriptors and machine learning are playing an increasingly valuable role in research and development portfolios across both academia and industry [6–8].

Polymers have long suffered from a lack of integrated data on electronic, mechanical, thermal, dielectric, transport, rheological, biodegradability and other properties across large chemical spaces. Available data is scattered across repositories with limited content, outdated handbooks, and a continuously growing open literature that tends to be heterogeneous and defies painstaking manual data retrieval. An efficient and (semi-)automatic pipeline for polymer data capture from all available sources in a continuous and sustainable manner is urgently needed. Moreover, while computational data has been systematically generated and archived for public use by many for inorganic materials, such efforts have not been undertaken on a large scale for polymeric materials.

In this proposed work, we seek to fill this gap and plan to create the largest, continuously evolving repository of polymer data. Such a capability will lead to a number of advantages and developments that can accelerate polymer discovery, development, optimization and deployment for a number of DoD and civilian applications.
These benefits include:

(1) An easy-to-maintain, up-to-date polymer database that can be directly queried and searched (e.g., during materials selection for a particular application);
(2) Mining of the data can lead to insights into structure-property and synthesis-polymer correlations, and knowledge of the limits of achievable property ranges within specific chemical subclasses;
(3) Surrogate (machine learning) models may be built from the data for the rapid prediction of the properties of polymers not already in the database;
(4) Strategies may be created for the direct design of polymers (along with the necessary synthesis steps) meeting a set of target property requirements, using one of many emerging machine learning algorithms trained on the available data.

Progress has been made to a limited extent on items (2)-(3) above, but these developments will be permanently constrained by the dataset on which they are built. Hence the primary goal of this proposal is the creation of an efficient and (semi-)automatic pipeline for capturing data on neat organic and metal-containing polymers in a continuous and sustainable manner. Item (1) above will be the primary deliverable, which will spawn (2)-(4) as secondary outcomes.

Document Details

Document Type
DoD Grant Award
Publication Date
Apr 29, 2020
Source ID
N000142012175

Entities

People

  • Ramamurthy Ramprasad

Organizations

  • Georgia Tech Research Corporation
  • Office of Naval Research
  • United States Navy

Tags

Readers

  • Data Mining and Knowledge Discovery
  • Distributed Systems and Data Platform Development
  • Reinforced Composite Materials

Technology Areas

  • AI & ML
  • AI & ML - DoD AI Strategy
  • AI & ML - Machine Learning Algorithms
  • Microelectronics